Implementing Search with Elasticsearch
When your application grows to millions of records, SQL LIKE queries just don't cut it. They're slow, they don't understand relevance, and they can't handle typos, synonyms, or complex filtering. Elasticsearch is the industry-standard solution: a distributed, RESTful search engine built on Apache Lucene that powers search at companies like GitHub, Wikipedia, Netflix, and Uber. In this guide you'll go from zero to a production-ready search implementation — covering installation, index design, query DSL, aggregations, and full PHP integration.
A SELECT * FROM products WHERE name LIKE '%laptop%' scans every row and can't use indexes. Elasticsearch uses an inverted index — the same structure as a book's index — so lookups stay fast even as the dataset grows into the millions of documents, and it scores results by relevance automatically.
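To make the inverted-index idea concrete, here is a toy sketch in Python (illustrative only; Lucene's real implementation adds term dictionaries, postings compression, positions, and scoring):

```python
from collections import defaultdict

def build_inverted_index(docs):
    """Map each token to the set of document IDs containing it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for token in text.lower().split():
            index[token].add(doc_id)
    return index

docs = {
    1: "MacBook Pro laptop",
    2: "Dell XPS laptop",
    3: "Sony headphones",
}
index = build_inverted_index(docs)

# Lookup is a single hash access, independent of corpus size
print(sorted(index["laptop"]))  # [1, 2]
```

Instead of scanning every document for "laptop", the engine jumps straight to the postings list for that term; that is why query time barely changes as the index grows.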
How Elasticsearch Works
Elasticsearch stores data as JSON documents inside indices (analogous to database tables). Each index is split into shards — smaller Lucene indexes distributed across cluster nodes. This horizontal scaling is what allows Elasticsearch to handle billions of documents.
The key concepts you need to understand:
- Index: A collection of documents with a common schema (like a database table)
- Document: A single JSON record stored in an index
- Mapping: The schema that defines field types (text, keyword, date, integer, etc.)
- Shard: A single Lucene instance — indices are divided into shards for distribution
- Replica: A copy of a shard for fault tolerance and read performance
- Node: A single Elasticsearch server instance
- Cluster: One or more nodes working together
Figure: Elasticsearch cluster with 3 nodes, 3 primary shards (P0–P2), and 3 replicas
Installation and Setup
Elasticsearch runs on the JVM (recent versions bundle their own JDK) and is available for all major platforms. The easiest way to run it locally is via Docker, which avoids JVM configuration entirely.
Run Elasticsearch with Docker
The fastest way to get started — no JDK setup required, isolated from your system.
# Pull and run Elasticsearch 8.x (single-node dev mode)
docker run -d \
--name elasticsearch \
-p 9200:9200 \
-e "discovery.type=single-node" \
-e "xpack.security.enabled=false" \
docker.elastic.co/elasticsearch/elasticsearch:8.12.0
# Verify it's running
curl http://localhost:9200
# Expected output:
# {
# "name" : "...",
# "cluster_name" : "docker-cluster",
# "version" : { "number" : "8.12.0", ... }
# }
Or Install Natively (Ubuntu/Debian)
For production servers, native installation integrates with systemd for automatic restarts.
# Import GPG key
wget -qO - https://artifacts.elastic.co/GPG-KEY-elasticsearch | sudo gpg --dearmor -o /usr/share/keyrings/elasticsearch-keyring.gpg
# Add repository
echo "deb [signed-by=/usr/share/keyrings/elasticsearch-keyring.gpg] https://artifacts.elastic.co/packages/8.x/apt stable main" | sudo tee /etc/apt/sources.list.d/elastic-8.x.list
# Install
sudo apt update && sudo apt install elasticsearch
# Enable and start
sudo systemctl enable elasticsearch
sudo systemctl start elasticsearch
Elasticsearch 8.x ships with TLS and authentication enabled by default. The xpack.security.enabled=false flag used above is only for local development. In production, always enable security and place Elasticsearch behind a firewall — never expose port 9200 to the public internet.
Designing Your Index Mapping
Mapping is Elasticsearch's equivalent of a database schema. Getting it right upfront matters because most mapping changes require a full reindex — you can't change a field's type after documents are indexed. The two most important field types are:
- text: Full-text search — analyzed, tokenized, stemmed. Use for blog content, product descriptions, anything you want to search "inside."
- keyword: Exact match — not analyzed. Use for filtering, sorting, aggregations (categories, status, IDs).
PUT /products
{
"settings": {
"number_of_shards": 3,
"number_of_replicas": 1,
"analysis": {
"analyzer": {
"product_analyzer": {
"type": "custom",
"tokenizer": "standard",
"filter": ["lowercase", "asciifolding", "stop"]
}
}
}
},
"mappings": {
"properties": {
"name": {
"type": "text",
"analyzer": "product_analyzer",
"fields": {
"keyword": { "type": "keyword" }
}
},
"description": { "type": "text", "analyzer": "product_analyzer" },
"category": { "type": "keyword" },
"price": { "type": "float" },
"rating": { "type": "float" },
"in_stock": { "type": "boolean" },
"tags": { "type": "keyword" },
"created_at": { "type": "date" }
}
}
}
The name field uses a multi-field mapping — name for full-text search and name.keyword for exact match and sorting. This is a common pattern that avoids having to define two separate fields.
| Field Type | Use Case | Analyzable | Aggregatable |
|---|---|---|---|
| text | Full-text search (articles, descriptions) | Yes | No |
| keyword | Filtering, facets, exact match | No | Yes |
| integer / float | Numeric ranges, sorting, math | No | Yes |
| date | Date ranges, time-based sorting | No | Yes |
| boolean | Binary flags | No | Yes |
| nested | Arrays of objects with independent queries | Depends | Yes |
Indexing Documents
Documents are added to Elasticsearch via a simple HTTP request. With POST, Elasticsearch auto-generates the document ID; with PUT you provide your own (typically the database primary key, so you can sync updates later).
# Index a document with explicit ID (recommended — use DB primary key)
PUT /products/_doc/1
{
"name": "MacBook Pro 16-inch M3",
"description": "Apple's most powerful laptop with M3 Pro chip, 18GB RAM, and 512GB SSD.",
"category": "Laptops",
"price": 2499.99,
"rating": 4.8,
"in_stock": true,
"tags": ["apple", "laptop", "m3", "professional"],
"created_at": "2026-01-15"
}
# Update a specific field (partial update)
POST /products/_update/1
{
"doc": {
"price": 2299.99,
"in_stock": false
}
}
# Delete a document
DELETE /products/_doc/1
# Bulk indexing (much more efficient for large datasets)
POST /_bulk
{ "index": { "_index": "products", "_id": "2" } }
{ "name": "Dell XPS 15", "category": "Laptops", "price": 1799.99, "rating": 4.5, "in_stock": true, "tags": ["dell", "laptop"] }
{ "index": { "_index": "products", "_id": "3" } }
{ "name": "Sony WH-1000XM5", "category": "Headphones", "price": 349.99, "rating": 4.9, "in_stock": true, "tags": ["sony", "headphones", "noise-cancelling"] }
For large initial imports, always use the Bulk API. Sending documents one at a time creates significant HTTP overhead — bulk requests can be 10–100× faster.
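The _bulk body is newline-delimited JSON: an action line followed by the document source, ending with a trailing newline. A small Python helper (a sketch; in practice the official clients ship bulk helpers that do this for you) makes the format explicit:

```python
import json

def to_bulk_ndjson(index, docs):
    """Serialize docs (dicts containing an 'id' key) into a _bulk request body."""
    lines = []
    for doc in docs:
        doc = dict(doc)  # copy so popping the id doesn't mutate the caller's dict
        action = {"index": {"_index": index, "_id": str(doc.pop("id"))}}
        lines.append(json.dumps(action))
        lines.append(json.dumps(doc))
    return "\n".join(lines) + "\n"  # the bulk body must end with a newline

body = to_bulk_ndjson("products", [
    {"id": 2, "name": "Dell XPS 15", "price": 1799.99},
    {"id": 3, "name": "Sony WH-1000XM5", "price": 349.99},
])
# POST the body to http://localhost:9200/_bulk
# with header Content-Type: application/x-ndjson
```

One HTTP round trip now carries the whole batch, which is where the 10–100× speedup over per-document requests comes from.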
Query DSL: Searching Your Data
Elasticsearch's Query DSL (Domain Specific Language) is a JSON-based query language. The two fundamental concepts are:
- Query context: "How well does this document match?" — scores documents by relevance
- Filter context: "Does this document match?" — yes/no, no scoring, results are cached
Match Query — Basic Full-Text Search
# Simple match — searches a single field
GET /products/_search
{
"query": {
"match": {
"name": "macbook pro"
}
}
}
# Multi-match — search across multiple fields with field boosting
GET /products/_search
{
"query": {
"multi_match": {
"query": "macbook pro",
"fields": ["name^3", "description", "tags^2"],
"type": "best_fields",
"fuzziness": "AUTO"
}
},
"_source": ["name", "price", "rating"],
"from": 0,
"size": 10
}
# name^3 means "name" is 3x more important than description
# fuzziness: AUTO handles typos (e.g. "macbok" still finds "MacBook")
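fuzziness: AUTO maps each query term's length to a maximum Levenshtein edit distance: 0 edits for terms of 1–2 characters, 1 edit for 3–5, and 2 edits for 6 or more. A quick Python check of why "macbok" still finds "MacBook":

```python
def auto_fuzziness(term):
    """Max edit distance that Elasticsearch's fuzziness AUTO allows for a term."""
    n = len(term)
    if n <= 2:
        return 0
    if n <= 5:
        return 1
    return 2

def levenshtein(a, b):
    """Classic dynamic-programming edit distance."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1, cur[j - 1] + 1, prev[j - 1] + (ca != cb)))
        prev = cur
    return prev[-1]

typo, target = "macbok", "macbook"
print(levenshtein(typo, target))  # 1 (one missing letter)
print(auto_fuzziness(typo))       # 2, so the typo is within the allowed distance
```

The length-based cutoff keeps short terms strict (so "cat" doesn't match "car" with two edits) while tolerating realistic typos in longer words.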
Bool Query — Combining Conditions
The bool query is the workhorse of Elasticsearch. It combines other queries with four clauses:
- must: Document must match — contributes to score
- should: Document may match — boosts score if it does
- must_not: Document must NOT match — in filter context (no scoring)
- filter: Document must match — in filter context (cached, no scoring)
GET /products/_search
{
"query": {
"bool": {
"must": [
{
"multi_match": {
"query": "laptop",
"fields": ["name^3", "description"],
"fuzziness": "AUTO"
}
}
],
"filter": [
{ "term": { "in_stock": true } },
{ "term": { "category": "Laptops" } },
{
"range": {
"price": { "gte": 1000, "lte": 3000 }
}
},
{
"range": {
"rating": { "gte": 4.0 }
}
}
],
"must_not": [
{ "term": { "tags": "refurbished" } }
]
}
},
"sort": [
{ "_score": { "order": "desc" } },
{ "rating": { "order": "desc" } }
],
"from": 0,
"size": 20
}
Notice that the full-text search is in must (scores matter) while category, price, stock, and rating are in filter (binary match, cached). This is the correct pattern — filters are significantly faster because their results are cached by Elasticsearch.
Highlight and Autocomplete
# Highlight matching terms in results
GET /products/_search
{
"query": {
"match": { "description": "M3 chip performance" }
},
"highlight": {
"fields": {
"description": {
"pre_tags": ["<mark>"],
"post_tags": ["</mark>"],
"number_of_fragments": 3,
"fragment_size": 150
}
}
}
}
# Completion suggester (autocomplete as-you-type)
# First, add completion field to mapping:
# "name_suggest": { "type": "completion" }
POST /products/_search
{
"suggest": {
"product-suggest": {
"prefix": "mac",
"completion": {
"field": "name_suggest",
"size": 5,
"skip_duplicates": true
}
}
}
}
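Internally, the completion suggester stores suggestions in an in-memory FST built for fast prefix lookups. Conceptually it behaves like a scan over sorted terms (a toy Python illustration, not the real data structure):

```python
import bisect

def suggest(sorted_terms, prefix, size=5):
    """Return up to `size` terms starting with `prefix` from a sorted list."""
    lo = bisect.bisect_left(sorted_terms, prefix)  # jump to the first candidate
    out = []
    for term in sorted_terms[lo:]:
        if not term.startswith(prefix) or len(out) == size:
            break
        out.append(term)
    return out

terms = sorted(["macbook pro", "macbook air", "mac mini", "magic mouse", "dell xps"])
print(suggest(terms, "mac"))  # ['mac mini', 'macbook air', 'macbook pro']
```

Because all terms sharing a prefix are contiguous in sorted order, a suggestion query never touches the rest of the vocabulary; this is what makes as-you-type latency feasible.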
Aggregations: Analytics and Facets
Aggregations are Elasticsearch's analytics engine — they let you calculate statistics, build faceted navigation (like e-commerce filters), and generate histograms, all in a single query. Think of them as SQL's GROUP BY on steroids, running in parallel across all shards.
GET /products/_search
{
"query": {
"bool": {
"must": { "match": { "name": "laptop" } },
"filter": { "term": { "in_stock": true } }
}
},
"aggs": {
"by_category": {
"terms": {
"field": "category",
"size": 10
}
},
"price_ranges": {
"range": {
"field": "price",
"ranges": [
{ "to": 500, "key": "Under $500" },
{ "from": 500, "to": 1000, "key": "$500 - $1000" },
{ "from": 1000, "to": 2000, "key": "$1000 - $2000" },
{ "from": 2000, "key": "Over $2000" }
]
}
},
"avg_rating": {
"avg": { "field": "rating" }
},
"price_stats": {
"stats": { "field": "price" }
},
"top_tags": {
"terms": { "field": "tags", "size": 20 }
}
},
"size": 10
}
# Response includes:
# hits.hits[] — matched documents
# aggregations.by_category.buckets[] — facet counts per category
# aggregations.price_ranges.buckets[] — count per price range
# aggregations.avg_rating.value — 4.6
# aggregations.price_stats — { min, max, avg, sum, count }
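Each terms aggregation comes back as a list of buckets, which map directly onto UI facets. A sketch of flattening a simplified, hypothetical response into filter options:

```python
response = {  # trimmed example of the shape a search response could take
    "aggregations": {
        "by_category": {"buckets": [
            {"key": "Laptops", "doc_count": 42},
            {"key": "Headphones", "doc_count": 17},
        ]},
        "avg_rating": {"value": 4.6},
    }
}

def facet_options(response, agg_name):
    """Flatten a terms aggregation into (label, count) pairs for a filter UI."""
    buckets = response["aggregations"][agg_name]["buckets"]
    return [(b["key"], b["doc_count"]) for b in buckets]

print(facet_options(response, "by_category"))
# [('Laptops', 42), ('Headphones', 17)]
```

Since the hits and the aggregations arrive in the same response, one query renders both the result list and every sidebar facet.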
PHP and Laravel Integration
The official Elasticsearch PHP client makes it easy to interact with Elasticsearch from your Laravel or Symfony application. Install it via Composer:
composer require elasticsearch/elasticsearch
Service Class Pattern
<?php
namespace App\Services;
use Elastic\Elasticsearch\ClientBuilder;
use Elastic\Elasticsearch\Client;
class SearchService
{
private Client $client;
private string $index = 'products';
public function __construct()
{
$this->client = ClientBuilder::create()
->setHosts([config('services.elasticsearch.host', 'localhost:9200')])
->build();
}
public function indexProduct(array $product): void
{
$this->client->index([
'index' => $this->index,
'id' => $product['id'],
'body' => [
'name' => $product['name'],
'description' => $product['description'],
'category' => $product['category'],
'price' => (float) $product['price'],
'rating' => (float) $product['rating'],
'in_stock' => (bool) $product['in_stock'],
'tags' => $product['tags'] ?? [],
'created_at' => $product['created_at'],
],
]);
}
public function deleteProduct(int $id): void
{
$this->client->delete([
'index' => $this->index,
'id' => $id,
]);
}
public function search(
string $query,
array $filters = [],
int $page = 1,
int $perPage = 20
): array {
$must = [];
$filterClauses = [];
if (!empty($query)) {
$must[] = [
'multi_match' => [
'query' => $query,
'fields' => ['name^3', 'description', 'tags^2'],
'fuzziness' => 'AUTO',
],
];
} else {
$must[] = ['match_all' => new \stdClass()];
}
if (!empty($filters['category'])) {
$filterClauses[] = ['term' => ['category' => $filters['category']]];
}
if (isset($filters['in_stock']) && $filters['in_stock']) {
$filterClauses[] = ['term' => ['in_stock' => true]];
}
if (!empty($filters['min_price']) || !empty($filters['max_price'])) {
$range = [];
if (!empty($filters['min_price'])) $range['gte'] = (float) $filters['min_price'];
if (!empty($filters['max_price'])) $range['lte'] = (float) $filters['max_price'];
$filterClauses[] = ['range' => ['price' => $range]];
}
$params = [
'index' => $this->index,
'body' => [
'query' => [
'bool' => [
'must' => $must,
'filter' => $filterClauses,
],
],
'aggs' => [
'categories' => ['terms' => ['field' => 'category', 'size' => 20]],
'price_ranges' => [
'range' => [
'field' => 'price',
'ranges' => [
['to' => 500],
['from' => 500, 'to' => 1000],
['from' => 1000, 'to' => 2000],
['from' => 2000],
],
],
],
'avg_rating' => ['avg' => ['field' => 'rating']],
],
'highlight' => [
'fields' => [
'name' => ['number_of_fragments' => 0],
'description' => ['number_of_fragments' => 2, 'fragment_size' => 150],
],
],
'from' => ($page - 1) * $perPage,
'size' => $perPage,
'sort' => [
['_score' => ['order' => 'desc']],
['rating' => ['order' => 'desc']],
],
],
];
$response = $this->client->search($params);
return [
'total' => $response['hits']['total']['value'],
'hits' => $response['hits']['hits'],
'aggregations' => $response['aggregations'] ?? [],
'took_ms' => $response['took'],
];
}
}
Keeping Elasticsearch in Sync with MySQL
Elasticsearch should be treated as a read-optimized replica of your primary database, not the source of truth. The most reliable sync pattern uses Laravel's Model Observers to automatically index changes:
<?php
namespace App\Observers;
use App\Models\Product;
use App\Services\SearchService;
class ProductObserver
{
public function __construct(private SearchService $search) {}
public function saved(Product $product): void
{
$this->search->indexProduct($product->toArray());
}
public function deleted(Product $product): void
{
$this->search->deleteProduct($product->id);
}
}
// Register in AppServiceProvider::boot():
// Product::observe(ProductObserver::class);
Initial Bulk Import
For existing data, create an Artisan command that streams records from MySQL and bulk-indexes them into Elasticsearch in batches of 500–1000.
Real-Time Sync via Observer
The Model Observer handles all future creates, updates, and deletes automatically — no extra code needed in your controllers.
Nightly Re-Index (Optional)
A scheduled Laravel command that re-indexes all records catches any edge cases where the observer sync was skipped (e.g., bulk DB updates that bypass Eloquent).
Artisan Bulk Reindex Command
<?php
namespace App\Console\Commands;
use App\Models\Product;
use App\Services\SearchService;
use Illuminate\Console\Command;
class ReindexProducts extends Command
{
protected $signature = 'search:reindex {--fresh : Delete index before reindexing}';
protected $description = 'Reindex all products in Elasticsearch';
public function handle(SearchService $search): int
{
if ($this->option('fresh')) {
// Drop and recreate the index before reindexing. Assumes a
// recreateIndex() helper on SearchService (hypothetical, not shown above).
$search->recreateIndex();
}
$total = Product::count();
$bar = $this->output->createProgressBar($total);
Product::query()
->with('category')
->chunkById(500, function ($products) use ($search, $bar) {
// For maximum throughput, collect each chunk into a single
// Bulk API request rather than indexing one document at a time
foreach ($products as $product) {
$search->indexProduct($product->toArray());
$bar->advance();
}
});
$bar->finish();
$this->newLine();
$this->info("Indexed {$total} products successfully.");
return Command::SUCCESS;
}
}
Performance Best Practices
Elasticsearch is fast by default, but a few key decisions determine whether you'll get millisecond or second response times at scale:
- Use filter context for exact matches: Filters are cached; queries are not. Put category, price range, and boolean filters inside filter, not must.
- Avoid deep pagination: from: 10000 is very expensive — Elasticsearch must retrieve and discard 10,000 documents on every shard. Use search_after cursor-based pagination for deep results.
- Right-size your shards: Aim for 20–40 GB per shard. Too many small shards create overhead; too few large shards limit parallelism.
- Disable _source for large fields: If you only need IDs from Elasticsearch (fetching full data from MySQL), disable _source or use stored_fields to avoid storing large text twice.
- Set refresh_interval to 30s during bulk indexing: The default 1-second refresh is expensive during large imports. Set "refresh_interval": "30s" while loading and restore it to "1s" after.
- Use aliases for zero-downtime reindexing: Point your application at an alias (products) rather than the index directly. When you reindex, swap the alias atomically to the new index.
Create products_v2, index all data into it, then atomically swap the products alias: POST /_aliases with {"actions": [{"remove": {"index": "products_v1", "alias": "products"}}, {"add": {"index": "products_v2", "alias": "products"}}]}. Your application never sees an interruption.
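The search_after pattern mentioned above replaces from/size offsets with a cursor: each request carries the sort values of the previous page's last hit. A Python sketch of building the successive request bodies (pure dict manipulation; the id tiebreaker field is an assumption about your mapping, and sending the bodies to Elasticsearch is left out):

```python
def first_page(query, size=20):
    """Initial request: a deterministic sort with a unique tiebreaker field."""
    return {
        "query": query,
        "size": size,
        # 'id' is assumed to be a unique field in the mapping; it breaks
        # rating ties so the cursor position is unambiguous
        "sort": [{"rating": "desc"}, {"id": "asc"}],
    }

def next_page(prev_body, last_hit):
    """Follow-up request carrying the previous page's last sort values."""
    body = dict(prev_body)
    body["search_after"] = last_hit["sort"]
    return body

page1 = first_page({"match": {"name": "laptop"}})
last_hit = {"sort": [4.5, "42"]}  # hypothetical last hit returned for page 1
page2 = next_page(page1, last_hit)
print(page2["search_after"])  # [4.5, '42']
```

Each page seeks directly to its starting point in sort order, so page 500 costs roughly the same as page 1 instead of forcing every shard to buffer and discard thousands of hits.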
Conclusion
Elasticsearch transforms search from a slow SQL afterthought into a first-class feature. Here's what you've learned in this guide:
Key Takeaways
- Architecture: Elasticsearch uses an inverted index distributed across shards for millisecond-level full-text search at large scale
- Mapping: Choose text for full-text search and keyword for filtering/aggregations — getting this right upfront avoids painful reindexes
- Bool Query: Combine must (scored) with filter (cached) clauses for maximum performance
- Aggregations: Power faceted navigation — categories, price ranges, ratings — all in a single query
- PHP Integration: Use Model Observers to keep Elasticsearch in sync with your primary database automatically
- Performance: Use filters over queries for exact matches, avoid deep pagination, and use aliases for zero-downtime reindexing
"Elasticsearch is not just a search engine — it's your analytics layer, your autocomplete engine, and your log aggregator all in one. Master the Query DSL and you've unlocked one of the most powerful tools in the backend developer's toolkit."
The patterns in this guide — the index mapping, the bool query structure, the PHP service class, and the Observer-based sync — are battle-tested in production at scale. Start with a single node locally, validate your mapping and queries, then scale horizontally as your data grows. Elasticsearch handles the distribution automatically.