1️⃣ What is Elasticsearch?

Elasticsearch is a distributed search and analytics engine used for:

  • Full-text search
  • Filtering
  • Autocomplete
  • Ranking results
  • Real-time analytics

Built on Apache Lucene.

Use it when SQL LIKE '%word%' becomes slow at scale.


2️⃣ When to Use in System Design Interview

Bring up Elasticsearch when the interviewer asks about:

  • Design product search (Amazon, Flipkart)
  • Ride/location search (Uber nearby drivers)
  • Job search portal
  • Social media post search
  • Ticket/movie/event search
  • Log analytics system
  • Any system needing fast keyword search

3️⃣ Why Not Just SQL?

SQL databases are great for transactions, but their text search is limited.

Example:

SELECT * FROM products WHERE name LIKE '%iphone%';

Problems:

  • Slow on millions of rows (a leading wildcard cannot use a normal B-tree index, so the table is scanned)
  • Poor typo handling
  • No relevance ranking
  • Weak autocomplete
  • Stemming is hard (running should match run)

Elasticsearch solves these.
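
To make the stemming point concrete, here is a minimal sketch in Python. The crude_stem and analyze helpers are made up for illustration and are far simpler than a real Elasticsearch analyzer. The sketch only shows why matching on normalized terms finds results that a substring match misses.

```python
import re

def crude_stem(token: str) -> str:
    """Very rough suffix stripping; real analyzers use proper stemming algorithms."""
    if token.endswith("ing") and len(token) > 5:
        stem = token[:-3]
        # Collapse the doubled consonant left behind by "-ing" (runn -> run).
        if len(stem) >= 2 and stem[-1] == stem[-2]:
            stem = stem[:-1]
        return stem
    if token.endswith("s") and not token.endswith("ss") and len(token) > 3:
        return token[:-1]
    return token

def analyze(text: str) -> list:
    """Lowercase, split into word tokens, then stem each token."""
    return [crude_stem(t) for t in re.findall(r"[a-z0-9]+", text.lower())]

title = "Shoes for a morning run"
query = "running"

# What LIKE '%running%' effectively does: a plain substring check.
like_match = query.lower() in title.lower()

# What a search engine does: compare analyzed terms on both sides.
term_match = set(analyze(query)) <= set(analyze(title))

print(like_match, term_match)
```

The substring check misses the title, while the analyzed terms match because both sides reduce to "run".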


4️⃣ How to Mention in Interview

Say:

The primary DB (MySQL/Postgres) is the source of truth.
Elasticsearch is used as a secondary search index for fast querying.


5️⃣ Architecture in System Design

Searching:

User Search Query
      ↓
 API Gateway
      ↓
 Search Service
      ↓
 Elasticsearch Cluster
      ↓
 Return ranked results

Write path:

User creates product/job/post
      ↓
 Main DB saves data
      ↓
 Kafka / Async Worker
      ↓
 Sync data to Elasticsearch

6️⃣ Why Async Sync?

Do not write to both the DB and Elasticsearch directly in the request path.

Use:

  • Kafka
  • RabbitMQ
  • Background worker
  • CDC pipeline

Reason:

  • Better reliability
  • Retry on failure
  • Loose coupling
  • Better latency
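
A minimal sketch of the async write path, assuming in-memory stand-ins: queue.Queue plays the role of Kafka/RabbitMQ, a dict plays the primary DB, and FlakySearchIndex is a made-up class that fails a few times to show the retry. None of these are real client APIs.

```python
import queue

class FlakySearchIndex:
    """In-memory stand-in for Elasticsearch that fails the first few writes."""
    def __init__(self, failures_before_success: int = 2):
        self.docs = {}
        self._remaining_failures = failures_before_success

    def index(self, doc_id, doc):
        if self._remaining_failures > 0:
            self._remaining_failures -= 1
            raise ConnectionError("search cluster unavailable")
        self.docs[doc_id] = doc

primary_db = {}          # stand-in for MySQL/Postgres
events = queue.Queue()   # stand-in for a Kafka topic or RabbitMQ queue

def create_product(doc_id, doc):
    """Request path: save to the DB, enqueue an event, return immediately."""
    primary_db[doc_id] = doc
    events.put(doc_id)

def run_indexer(index, max_attempts: int = 5):
    """Background worker: drain the queue and index each document with retries."""
    while not events.empty():
        doc_id = events.get()
        for attempt in range(1, max_attempts + 1):
            try:
                # Re-read from the DB so the index always gets the latest row.
                index.index(doc_id, primary_db[doc_id])
                break
            except ConnectionError:
                if attempt == max_attempts:
                    raise
                # A real worker would back off (sleep) before the next attempt.

search_index = FlakySearchIndex()
create_product(101, {"name": "iPhone 15", "price": 80000})
run_indexer(search_index)
print(search_index.docs)
```

The request returns as soon as the DB write and enqueue succeed; a search cluster outage only delays indexing instead of failing the user's request.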

7️⃣ Core Concepts to Speak in Interview

Index

A named collection of documents, loosely comparable to a table in a relational database.

Example:

  • products_index
  • jobs_index
  • users_index

Document

A JSON record stored inside an index.

{
  "id": 101,
  "name": "iPhone 15",
  "brand": "Apple",
  "price": 80000
}
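
Documents like the one above are usually sent in batches. A minimal sketch that builds a request body for the _bulk endpoint, which expects newline-delimited JSON: one action line, then one document line. The to_bulk_body helper and the index name products_index are illustrative assumptions; actually sending the body over HTTP is left out.

```python
import json

def to_bulk_body(index_name: str, docs: list) -> str:
    """Build a newline-delimited JSON body for a bulk indexing request."""
    lines = []
    for doc in docs:
        # Action line: which index and which document id to write.
        lines.append(json.dumps({"index": {"_index": index_name, "_id": str(doc["id"])}}))
        # Source line: the document itself.
        lines.append(json.dumps(doc))
    # The bulk format requires a trailing newline after the last line.
    return "\n".join(lines) + "\n"

products = [
    {"id": 101, "name": "iPhone 15", "brand": "Apple", "price": 80000},
    {"id": 102, "name": "Galaxy S24", "brand": "Samsung", "price": 75000},
]
body = to_bulk_body("products_index", products)
print(body)
```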

Shards

A large index is split into pieces (shards) spread across servers.

Use for horizontal scaling.
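
A simplified sketch of how a document lands on a shard: hash the routing value (the document id by default) and map it onto the primary shards. Elasticsearch uses a Murmur3 hash and a slightly more involved formula; zlib.crc32 and the pick_shard helper here are stand-ins for illustration only.

```python
import zlib

def pick_shard(routing_key: str, num_primary_shards: int) -> int:
    """Map a routing key to a shard number with a stable hash."""
    return zlib.crc32(routing_key.encode("utf-8")) % num_primary_shards

doc_ids = ["101", "102", "103", "104", "105", "106"]

# Placement with 3 primary shards.
placement = {doc_id: pick_shard(doc_id, 3) for doc_id in doc_ids}
print(placement)

# With a different shard count, the same ids can map to different shards.
moved = [d for d in doc_ids if pick_shard(d, 5) != placement[d]]
print("ids that would move with 5 shards:", moved)
```

Because placement depends on the shard count, changing the number of primary shards means moving data, which is why that number is chosen up front.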


Replicas

A copy of a primary shard, kept on a different node.

Used for:

  • High availability
  • Faster reads

Inverted Index data structure (Most Important)

Stores:

word → list of document ids

Example:

iphone → [1,4,8]
apple  → [1,2,9]

This makes search fast: a term lookup returns the matching document ids directly instead of scanning every document.
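
The word → document ids mapping can be sketched in a few lines of Python. This is a toy version with lowercase tokenization only; the sample titles are made up so that the output matches the example above.

```python
import re
from collections import defaultdict

def build_inverted_index(docs: dict) -> dict:
    """Map each term to the sorted list of document ids that contain it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for term in re.findall(r"[a-z0-9]+", text.lower()):
            index[term].add(doc_id)
    return {term: sorted(ids) for term, ids in index.items()}

def search_all(index: dict, query: str) -> list:
    """Return ids of documents containing every query term (AND semantics)."""
    terms = re.findall(r"[a-z0-9]+", query.lower())
    if not terms:
        return []
    postings = [set(index.get(term, [])) for term in terms]
    return sorted(set.intersection(*postings))

docs = {
    1: "Apple iPhone 15",
    2: "Apple MacBook Air",
    4: "iPhone case",
    8: "iPhone charger",
    9: "Apple Watch",
}
index = build_inverted_index(docs)
print(index["iphone"])                    # [1, 4, 8]
print(index["apple"])                     # [1, 2, 9]
print(search_all(index, "apple iphone"))  # [1]
```

A multi-word query becomes an intersection of posting lists, so the cost depends on the list sizes rather than on the total number of documents.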


8️⃣ Query Features

You can say Elasticsearch supports:

  • Full text search
  • Filtering
  • Sorting
  • Facets
  • Pagination
  • Highlighting
  • Autocomplete
  • Geo search
  • Fuzzy search (typo tolerance)
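
Fuzzy search is based on edit distance: how many single-character edits turn the typed word into an indexed term. A minimal sketch using plain Levenshtein distance follows; the fuzzy_candidates helper and the tiny vocabulary are illustrative assumptions, and a real engine does this far more efficiently.

```python
def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance via dynamic programming, one row at a time."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(
                prev[j] + 1,         # delete a character from a
                curr[j - 1] + 1,     # insert a character into a
                prev[j - 1] + cost,  # substitute (or keep) a character
            ))
        prev = curr
    return prev[-1]

def fuzzy_candidates(typed: str, vocabulary: list, max_edits: int = 1) -> list:
    """Return indexed terms within max_edits of what the user typed."""
    return sorted(t for t in vocabulary if edit_distance(typed, t) <= max_edits)

vocabulary = ["iphone", "ipad", "apple"]
print(fuzzy_candidates("ipone", vocabulary))  # ['iphone']
```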

9️⃣ Example in Interview

User searches:

iphone under 80k

Elasticsearch handles:

  • Keyword search = iphone
  • Filter = price < 80000
  • Sort = popularity
  • Ranking relevance
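
A sketch of how that request might look as an Elasticsearch query body, built as a Python dict. The name and price fields come from the sample document; the popularity field, the build_search_body helper, and the paging defaults are assumptions for illustration.

```python
import json

def build_search_body(keyword: str, max_price: int, page: int = 0, page_size: int = 10) -> dict:
    """Build a bool query: scored keyword match plus a non-scoring price filter."""
    return {
        "query": {
            "bool": {
                # "must" clauses contribute to the relevance score.
                "must": [{"match": {"name": keyword}}],
                # "filter" clauses only include or exclude; they do not affect the score.
                "filter": [{"range": {"price": {"lt": max_price}}}],
            }
        },
        # Sort by popularity first, then by relevance score as a tie-breaker.
        "sort": [{"popularity": {"order": "desc"}}, "_score"],
        # Pagination: skip "from" hits, return "size" hits.
        "from": page * page_size,
        "size": page_size,
    }

search_body = build_search_body("iphone", 80000)
print(json.dumps(search_body, indent=2))
```

Putting the price condition under filter rather than must keeps it out of scoring, which is the usual choice for yes/no conditions.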

🔟 Real System Design Usage

Use DB for:

  • bookings
  • payments
  • seats

Use Elasticsearch for:

  • search movies
  • city-wise events
  • theatre lookup
  • filter by language/date

1️⃣1️⃣ Scaling Story

If traffic increases:

  • Add more Elasticsearch nodes
  • Increase shards (the primary shard count is fixed at index creation, so this needs a reindex or the split API)
  • Add replicas
  • Cache hot queries in Redis
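
A minimal cache-aside sketch for hot queries. An in-memory dict with a TTL stands in for Redis, and the QueryCache class and fake_search function are made up for illustration. The clock is injectable so expiry can be shown without waiting.

```python
import time

class QueryCache:
    """Tiny TTL cache: return a cached result if fresh, else compute and store it."""
    def __init__(self, ttl_seconds: float = 30.0, clock=time.monotonic):
        self._ttl = ttl_seconds
        self._clock = clock
        self._entries = {}  # key -> (expires_at, value)

    def get_or_compute(self, key, compute):
        now = self._clock()
        entry = self._entries.get(key)
        if entry is not None and entry[0] > now:
            return entry[1]              # cache hit, still fresh
        value = compute()                # cache miss or expired: hit the search cluster
        self._entries[key] = (now + self._ttl, value)
        return value

fake_now = [0.0]
search_calls = []

def fake_search():
    """Stand-in for a real Elasticsearch query."""
    search_calls.append("iphone")
    return ["iPhone 15", "iPhone 14"]

cache = QueryCache(ttl_seconds=30, clock=lambda: fake_now[0])
cache.get_or_compute("q:iphone", fake_search)   # miss: runs the search
cache.get_or_compute("q:iphone", fake_search)   # hit: served from cache
fake_now[0] = 31.0
cache.get_or_compute("q:iphone", fake_search)   # expired: runs the search again
print(len(search_calls))
```

A short TTL keeps cached results from drifting too far from the index, which matters because search data is already slightly behind the DB.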

1️⃣2️⃣ Important Tradeoff

In this setup, Elasticsearch is eventually consistent with the primary DB.

New data may appear in search results a few seconds after the DB write.

Say this in the interview. It is a very valuable point.
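
Part of the delay comes from the async pipeline and part from Elasticsearch itself: a newly indexed document becomes searchable only after a refresh, which by default happens roughly once a second. A toy model of that behaviour is below; the NearRealTimeIndex class is an illustrative assumption, not how Lucene segments actually work.

```python
class NearRealTimeIndex:
    """Toy model: writes sit in a buffer until refresh() makes them searchable."""
    def __init__(self):
        self._buffer = {}
        self._searchable = {}

    def index(self, doc_id, doc):
        # Accepted, but not yet visible to search.
        self._buffer[doc_id] = doc

    def refresh(self):
        # Publish buffered writes to the searchable view.
        self._searchable.update(self._buffer)
        self._buffer.clear()

    def search_by_id(self, doc_id):
        return self._searchable.get(doc_id)

nrt = NearRealTimeIndex()
nrt.index(101, {"name": "iPhone 15"})
before_refresh = nrt.search_by_id(101)   # None: written but not searchable yet
nrt.refresh()
after_refresh = nrt.search_by_id(101)    # now visible
print(before_refresh, after_refresh)
```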


1️⃣3️⃣ What to Say in HLD

For search-heavy workloads, I’ll use Elasticsearch as a dedicated search engine.
The main relational DB remains the source of truth.
Data is asynchronously indexed into Elasticsearch through Kafka workers.
Search API queries Elasticsearch for low latency and relevance ranking.


1️⃣4️⃣ Interview Gold Lines

  • Use DB for transactions, Elasticsearch for search.
  • Never use Elasticsearch as primary booking/payment DB.
  • Great for read-heavy search systems.
  • Supports horizontal scaling.
  • Near real-time indexing.

1️⃣5️⃣ Quick 30 sec Answer

To optimize search, I’d introduce Elasticsearch. Product/event/user data is stored in the primary DB, then asynchronously indexed into Elasticsearch. Search requests hit Elasticsearch, which gives fast full-text search, filtering, typo tolerance, and ranked results.


1️⃣6️⃣ If Interviewer Asks Why Not SQL?

Say:

SQL handles exact match queries well, but search relevance, typo tolerance, stemming, and scalable text search are better handled by Elasticsearch.