2. Elasticsearch

What to Say in HLD

To optimize search, I’ll use Elasticsearch as a dedicated search engine.
Product/event/user data is stored in the primary relational DB, which remains the source of truth.
Data is asynchronously indexed into Elasticsearch through Kafka workers.
Search API queries Elasticsearch for low latency and relevance ranking.

Important Tradeoff

Elasticsearch is near real-time, and with an async indexing pipeline it is eventually consistent with the DB. New data may take a few seconds to become searchable after the DB write. Say this in the interview. It is a very valuable point.

If Interviewer Asks Why Not SQL?

Say:

SQL handles exact match queries well, but search relevance, typo tolerance, stemming, and scalable text search are better handled by Elasticsearch.

Interview Gold Lines

Use DB for transactions, Elasticsearch for search.
Never use Elasticsearch as primary booking/payment DB.
Great for read-heavy search systems.
Supports horizontal scaling.
Near real-time indexing.

1️⃣ What is Elasticsearch?

Elasticsearch is a distributed search and analytics engine used for:

Full-text search
Filtering
Autocomplete
Ranking results
Real-time analytics

Built on Apache Lucene.

Use it when SQL LIKE '%word%' becomes slow at scale.

2️⃣ When to Use in System Design Interview

Use Elasticsearch when interviewer asks:

Design product search (Amazon, Flipkart)
Ride/location search (Uber nearby drivers)
Job search portal
Social media post search
Ticket/movie/event search
Log analytics system
Any system needing fast keyword search

3️⃣ Why Not Just SQL?

SQL databases are great for transactions, but search is limited.

Example:

SELECT * FROM products WHERE name LIKE '%iphone%';

Problems:

Slow on millions of rows (a leading wildcard cannot use a normal B-tree index, so the DB scans the table)
Poor typo handling
No relevance ranking
Weak autocomplete
Stemming is hard (running → run)

Elasticsearch solves these.
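
To make the typo-handling gap concrete, here is a minimal Python sketch that contrasts substring matching (roughly what LIKE '%word%' does) with edit-distance matching, the idea behind fuzzy search. The function names and the sample product list are illustrative assumptions. Elasticsearch does its fuzzy matching inside Lucene, not with code like this.

```python
def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance: minimum single-character edits to turn a into b."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # delete a character from a
                            curr[j - 1] + 1,      # insert a character into a
                            prev[j - 1] + cost))  # substitute (or keep) a character
        prev = curr
    return prev[-1]


def substring_search(products, term):
    """Rough equivalent of SQL LIKE '%term%' (case-insensitive)."""
    return [p for p in products if term.lower() in p.lower()]


def fuzzy_search(products, term, max_edits=1):
    """Match a product if any of its words is within max_edits of the term."""
    term = term.lower()
    return [p for p in products
            if any(edit_distance(word, term) <= max_edits
                   for word in p.lower().split())]


products = ["iPhone 15", "Samsung Galaxy", "iPhone charger"]  # sample data
```

With the misspelled term "ipone", the substring search finds nothing, while the fuzzy search still matches both iPhone products because "ipone" is one edit away from "iphone".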

3️⃣ Architecture in System Design

Searching:

User Search Query
      ↓
 API Gateway
      ↓
 Search Service
      ↓
 Elasticsearch Cluster
      ↓
 Return ranked results

Write path:

User creates product/job/post
      ↓
 Main DB saves data
      ↓
 Kafka / Async Worker
      ↓
 Sync data to Elasticsearch
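
The write path above can be sketched in a few lines of Python. This is a toy model under stated assumptions: plain dicts stand in for the main DB and the Elasticsearch index, a deque stands in for the Kafka topic, and all names are hypothetical. It only shows the ordering of steps and why search lags the DB.

```python
from collections import deque

main_db = {}        # stand-in for the relational DB (source of truth)
search_index = {}   # stand-in for the Elasticsearch index
events = deque()    # stand-in for a Kafka topic of change events


def create_product(product_id, doc):
    """Request path: save to the DB, publish an event, return immediately."""
    main_db[product_id] = doc
    events.append(product_id)


def run_index_worker():
    """Async worker: copy changed rows from the DB into the search index."""
    indexed = 0
    while events:
        product_id = events.popleft()
        search_index[product_id] = main_db[product_id]
        indexed += 1
    return indexed


create_product(101, {"name": "iPhone 15", "brand": "Apple", "price": 80000})
visible_before = 101 in search_index   # worker has not run yet
run_index_worker()
visible_after = 101 in search_index    # searchable once the worker catches up
```

The gap between `visible_before` and `visible_after` is the eventual-consistency window: the DB already has the row, but search only sees it after the worker processes the event.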

4️⃣ Why Async Sync?

Do not write to both the DB and Elasticsearch synchronously in the request path.

Use:

Kafka
RabbitMQ
Background worker
CDC pipeline

Reason:

Better reliability
Retry on failure
Loose coupling
Better latency
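
The "retry on failure" point can be illustrated with a small sketch. Everything here is an assumption for illustration: `FlakyIndex` is a made-up stand-in for a search cluster that is briefly unavailable, and the dead-letter list is one common way to park documents that keep failing, not something the notes above prescribe.

```python
def index_with_retry(doc, index_fn, max_attempts=3, dead_letter=None):
    """Try to index doc; after max_attempts failures, park it for later."""
    for attempt in range(1, max_attempts + 1):
        try:
            index_fn(doc)
            return attempt      # how many attempts it took
        except ConnectionError:
            continue            # a real worker would back off before retrying
    if dead_letter is not None:
        dead_letter.append(doc)
    return None


class FlakyIndex:
    """Hypothetical search index that rejects the first `failures` writes."""

    def __init__(self, failures):
        self.failures = failures
        self.docs = []

    def index(self, doc):
        if self.failures > 0:
            self.failures -= 1
            raise ConnectionError("search cluster unavailable")
        self.docs.append(doc)


flaky = FlakyIndex(failures=2)
attempts = index_with_retry({"id": 101}, flaky.index)   # succeeds on 3rd try

dead = []
down = FlakyIndex(failures=10)
result = index_with_retry({"id": 102}, down.index, dead_letter=dead)
```

Because the retry lives in the worker, the user's request has already returned by the time the search cluster misbehaves. That is the loose-coupling and latency argument in one picture.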

5️⃣ Core Concepts to Speak in Interview

Index

Like a table in a relational database: a named collection of documents.

Example:

products_index
jobs_index
users_index

Document

JSON record inside index.

{
  "id": 101,
  "name": "iPhone 15",
  "brand": "Apple",
  "price": 80000
}

Shards

Large index split into pieces across servers. Use for horizontal scaling.
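
A rough sketch of how a document lands on a shard, assuming simple hash-modulo routing. Elasticsearch routes on a hash of the routing value (the document id by default) against the number of primary shards. The `zlib.crc32` hash and the function name below are stand-ins chosen for illustration, not what Elasticsearch uses internally.

```python
import zlib


def pick_shard(doc_id: str, num_primary_shards: int) -> int:
    """Map a document id to a shard number with hash-modulo routing."""
    # crc32 is a stand-in hash, chosen because it is stable across runs
    return zlib.crc32(doc_id.encode("utf-8")) % num_primary_shards


# Spread six sample documents over three shards
shards = {n: [] for n in range(3)}
for doc_id in ["101", "102", "103", "104", "105", "106"]:
    shards[pick_shard(doc_id, 3)].append(doc_id)
```

The same id always maps to the same shard, which is what lets a lookup by id go straight to one shard. It also hints at why the primary shard count is hard to change later: a different modulus would send existing ids to different shards.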

Replicas

Copy of shard. Used for:

High availability
Faster reads

Inverted Index data structure (Most Important)

Stores:

word → list of document ids

Example:

iphone → [1,4,8]
apple  → [1,2,9]

This makes search fast.
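
A minimal Python sketch of the word → document ids mapping, assuming whitespace tokenization and lowercasing only. Real analyzers also do stemming, stop words, and more. The sample documents are made up so that the resulting lists match the example above.

```python
from collections import defaultdict


def build_inverted_index(docs):
    """Map each lowercase word to the set of document ids containing it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for word in text.lower().split():
            index[word].add(doc_id)
    return index


def search(index, query):
    """Return sorted ids of documents that contain every word in the query."""
    words = query.lower().split()
    if not words:
        return []
    result = set(index.get(words[0], set()))
    for word in words[1:]:
        result &= index.get(word, set())   # intersect posting lists
    return sorted(result)


docs = {
    1: "Apple iPhone 15",
    2: "Apple MacBook Air",
    4: "iPhone case",
    8: "Used iPhone",
    9: "Apple Watch",
}
index = build_inverted_index(docs)
```

Here `index["iphone"]` is `{1, 4, 8}` and `index["apple"]` is `{1, 2, 9}`, so a search for "apple iphone" only has to intersect two small sets instead of scanning every document.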

6️⃣ Query Features

You can say Elasticsearch supports:

Full text search
Filtering
Sorting
Facets
Pagination
Highlighting
Autocomplete
Geo search
Fuzzy search (typo tolerance)

7️⃣ Example in Interview

E-commerce Search

User searches:

iphone under 80k

Elasticsearch handles:

Keyword search = iphone
Filter = price < 80000
Sort = popularity
Ranking relevance
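
As a sketch, the request above could translate into a Query DSL body like the one built below. The field names `name` and `price` come from the sample document earlier. `popularity` is an assumed field, and the helper function is hypothetical. The body would be sent to the index's `_search` endpoint.

```python
import json


def build_search_body(keyword, max_price, page=0, page_size=10):
    """Build an Elasticsearch search body: keyword match + price filter + sort."""
    return {
        "from": page * page_size,   # pagination offset
        "size": page_size,
        "query": {
            "bool": {
                # must: scored full-text match, tolerant of small typos
                "must": [
                    {"match": {"name": {"query": keyword, "fuzziness": "AUTO"}}}
                ],
                # filter: yes/no condition, does not affect the relevance score
                "filter": [
                    {"range": {"price": {"lt": max_price}}}
                ],
            }
        },
        # popularity first (assumed field), relevance score as tie-breaker
        "sort": [{"popularity": {"order": "desc"}}, "_score"],
    }


body = build_search_body("iphone", 80000)
body_json = json.dumps(body, indent=2)   # what goes over the wire
```

Putting the price condition under `filter` rather than `must` is the usual choice for exact conditions: it narrows the result set without changing how relevant each hit is considered.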

8️⃣ Real System Design Usage

Example: BookMyShow / Ticket Search

Use DB for:

bookings
payments
seats

Use Elasticsearch for:

search movies
city-wise events
theatre lookup
filter by language/date

9️⃣ Scaling Story

If traffic increases:

Add more Elasticsearch nodes
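Increase shards (primary shard count is fixed when the index is created, so this means reindexing or splitting the index)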
Add replicas
Cache hot queries in Redis
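
A cache-aside sketch for the "cache hot queries" point, assuming an in-process dict as a stand-in for Redis and a fixed TTL. The class and function names are hypothetical. With Redis, the dict lookups would become GET and SET-with-expiry calls.

```python
import time


class QueryCache:
    """Cache search results per query string for ttl_seconds."""

    def __init__(self, ttl_seconds=30.0, clock=time.monotonic):
        self.ttl = ttl_seconds
        self.clock = clock
        self.store = {}   # query -> (expires_at, results)

    def get_or_search(self, query, search_fn):
        now = self.clock()
        hit = self.store.get(query)
        if hit is not None and hit[0] > now:
            return hit[1]                      # cache hit: skip the cluster
        results = search_fn(query)             # cache miss: query Elasticsearch
        self.store[query] = (now + self.ttl, results)
        return results


calls = []


def slow_search(query):
    """Stand-in for a real Elasticsearch query; records each call."""
    calls.append(query)
    return [f"result for {query}"]


cache = QueryCache(ttl_seconds=30.0)
first = cache.get_or_search("iphone", slow_search)
second = cache.get_or_search("iphone", slow_search)   # served from cache
```

A short TTL keeps the staleness bounded, which fits a search system that is already eventually consistent.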
