1️⃣ What is Elasticsearch?
Elasticsearch is a distributed search and analytics engine used for:
- Full-text search
- Filtering
- Autocomplete
- Ranking results
- Real-time analytics
Built on Apache Lucene.
Use it when SQL LIKE '%word%' becomes slow at scale.
2️⃣ When to Use in System Design Interview
Use Elasticsearch when interviewer asks:
- Design product search (Amazon, Flipkart)
- Ride/location search (Uber nearby drivers)
- Job search portal
- Social media post search
- Ticket/movie/event search
- Log analytics system
- Any system needing fast keyword search
3️⃣ Why Not Just SQL?
SQL databases are great for transactions, but search is limited.
Example:
SELECT * FROM products WHERE name LIKE '%iphone%';
Problems:
- Slow on millions of rows (a leading wildcard cannot use a regular B-tree index, so the DB scans every row)
- Poor typo handling
- No ranking relevance
- Weak autocomplete
- Hard stemming (running = run)
Elasticsearch solves these.
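A minimal sketch of the problems listed above, using Python's built-in sqlite3. The products table, its rows, and the misspelled query are made up for illustration; a real MySQL/Postgres setup will differ in details, but the substring-only behaviour of LIKE is the same idea.

```python
import sqlite3

# In-memory database with a hypothetical products table.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE products (id INTEGER PRIMARY KEY, name TEXT)")
conn.executemany(
    "INSERT INTO products (id, name) VALUES (?, ?)",
    [
        (1, "iPhone 15"),
        (2, "Case for iPhone 15"),
        (3, "Running shoes"),
    ],
)

def like_search(term):
    """Substring match only: no typo handling, no stemming, no relevance score."""
    rows = conn.execute(
        "SELECT id FROM products WHERE name LIKE ? ORDER BY id",
        (f"%{term}%",),
    )
    return [row[0] for row in rows]

exact_hits = like_search("iphone")    # rows 1 and 2, ordered by id, not by relevance
typo_hits = like_search("iphne")      # one missing letter: nothing matches
stem_hits = like_search("run shoes")  # "run" vs "running": the phrase does not match
```

The phone and the phone case come back with equal standing, which is the "no ranking relevance" problem in miniature.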
4️⃣ How to Mention in Interview
Say:
Primary DB (MySQL/Postgres) stores source of truth.
Elasticsearch is used as a secondary search index for fast querying.
5️⃣ Architecture in System Design
Searching:
User Search Query
↓
API Gateway
↓
Search Service
↓
Elasticsearch Cluster
↓
Return ranked results
Write path:
User creates product/job/post
↓
Main DB saves data
↓
Kafka / Async Worker
↓
Sync data to Elasticsearch
6️⃣ Why Async Sync?
Do not write to both the DB and Elasticsearch synchronously in the request path (dual write).
Use:
- Kafka
- RabbitMQ
- Background worker
- CDC pipeline
Reason:
- Better reliability
- Retry on failure
- Loose coupling
- Better latency
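A rough sketch of the async write path described above. Everything here is a stand-in: a dict plays the primary DB, another dict plays the Elasticsearch index, queue.Queue plays Kafka/RabbitMQ, and index_document with its fail flag is a hypothetical helper that simulates the search cluster being down.

```python
import queue

primary_db = {}         # stand-in for MySQL/Postgres (source of truth)
search_index = {}       # stand-in for the Elasticsearch index
events = queue.Queue()  # stand-in for Kafka/RabbitMQ
dead_letter = []        # events that failed every retry

def create_product(product):
    """Request path: write to the DB, publish an event, return immediately."""
    primary_db[product["id"]] = product
    events.put(product["id"])

def index_document(doc, fail=False):
    """Hypothetical indexing call; `fail` simulates an unavailable search cluster."""
    if fail:
        raise ConnectionError("search cluster unavailable")
    search_index[doc["id"]] = doc

def run_worker(failures_to_simulate=0, max_attempts=3):
    """Drain the queue, retrying each event up to max_attempts times."""
    while not events.empty():
        doc_id = events.get()
        for _ in range(max_attempts):
            try:
                index_document(primary_db[doc_id], fail=failures_to_simulate > 0)
                break
            except ConnectionError:
                failures_to_simulate -= 1
        else:
            # All attempts failed: park the event instead of losing it.
            dead_letter.append(doc_id)

create_product({"id": 101, "name": "iPhone 15"})
indexed_before_worker = 101 in search_index  # only the DB has the product so far
run_worker(failures_to_simulate=2)           # two simulated failures, third attempt succeeds
indexed_after_worker = 101 in search_index
```

The request path never touches the search index, so a search-cluster outage slows indexing but does not fail the user's write.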
7️⃣ Core Concepts to Speak in Interview
Index
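A logical collection of documents, roughly like a table in a relational DB (older material compares it to a database).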
Example:
products_index
jobs_index
users_index
Document
JSON record inside index.
{
"id": 101,
"name": "iPhone 15",
"brand": "Apple",
"price": 80000
}
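As an illustration of how such a document travels to the index, here is a small sketch that serializes it for the Elasticsearch _bulk endpoint. That endpoint takes newline-delimited JSON: an action line followed by the document source, ending with a newline. The helper name to_bulk_lines is made up, and the index name is the example name from these notes.

```python
import json

def to_bulk_lines(index_name, docs):
    """Build a newline-delimited body: one action line, then one source line per doc."""
    lines = []
    for doc in docs:
        action = {"index": {"_index": index_name, "_id": str(doc["id"])}}
        lines.append(json.dumps(action))
        lines.append(json.dumps(doc))
    # The bulk body must end with a trailing newline.
    return "\n".join(lines) + "\n"

product = {"id": 101, "name": "iPhone 15", "brand": "Apple", "price": 80000}
payload = to_bulk_lines("products_index", [product])
```

Only the payload is built here; sending it (HTTP client, auth, error handling) is left out on purpose.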
Shards
Large index split into pieces across servers.
Use for horizontal scaling.
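A simplified sketch of how a document can be routed to a shard: hash the document id and take it modulo the number of primary shards. zlib.crc32 is only a stand-in hash chosen because it is stable across runs; Elasticsearch uses its own hash of the routing value, and the function name pick_shard is hypothetical.

```python
import zlib

def pick_shard(doc_id, num_primary_shards):
    """Map a document id to a shard number in [0, num_primary_shards)."""
    # crc32 is a stand-in hash, not the one Elasticsearch uses.
    return zlib.crc32(str(doc_id).encode("utf-8")) % num_primary_shards

doc_ids = range(1, 1001)
placement_3 = {doc_id: pick_shard(doc_id, 3) for doc_id in doc_ids}
placement_4 = {doc_id: pick_shard(doc_id, 4) for doc_id in doc_ids}

# Count how many documents would land on a different shard if the count changed.
moved = sum(1 for doc_id in doc_ids if placement_3[doc_id] != placement_4[doc_id])
```

Because placement depends on the shard count, changing that count moves documents around, which is why the number of primary shards is chosen up front.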
Replicas
Copy of shard.
Used for:
- High availability
- Faster reads
Inverted Index data structure (Most Important)
Stores:
word → list of document ids
Example:
iphone → [1,4,8]
apple → [1,2,9]
This makes search fast.
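A toy version of the idea, assuming whitespace tokenization and lowercasing only (real analyzers also do stemming, stop words, and more). The sample documents are made up so that the posting lists match the example above.

```python
from collections import defaultdict

def build_inverted_index(docs):
    """Map each lowercase word to the set of document ids containing it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for word in text.lower().split():
            index[word].add(doc_id)
    return index

def search(index, query):
    """AND search: return ids of documents containing every query word."""
    words = query.lower().split()
    if not words:
        return []
    postings = [index.get(word, set()) for word in words]
    return sorted(set.intersection(*postings))

docs = {
    1: "Apple iPhone 15",
    2: "Apple MacBook Air",
    4: "iPhone case",
    8: "iPhone charger",
    9: "Apple Watch",
}
index = build_inverted_index(docs)

iphone_ids = sorted(index["iphone"])   # [1, 4, 8]
apple_ids = sorted(index["apple"])     # [1, 2, 9]
both = search(index, "apple iphone")   # [1]
```

A lookup touches only the posting lists for the query words instead of scanning every document, which is where the speed comes from.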
8️⃣ Query Features
You can say Elasticsearch supports:
- Full text search
- Filtering
- Sorting
- Facets
- Pagination
- Highlighting
- Autocomplete
- Geo search
- Fuzzy search (typo tolerance)
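Typo tolerance is usually explained through edit distance: how many single-character edits turn one word into another. The sketch below uses plain Levenshtein distance over a made-up vocabulary; it illustrates the concept only and is not how Elasticsearch implements fuzzy matching internally.

```python
def edit_distance(a, b):
    """Levenshtein distance: minimum insertions, deletions, substitutions."""
    prev = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        curr = [i]
        for j, char_b in enumerate(b, start=1):
            cost = 0 if char_a == char_b else 1
            curr.append(min(
                prev[j] + 1,         # delete from a
                curr[j - 1] + 1,     # insert into a
                prev[j - 1] + cost,  # substitute (or match)
            ))
        prev = curr
    return prev[-1]

def fuzzy_match(query, vocabulary, max_edits=1):
    """Return vocabulary words within max_edits of the query."""
    return sorted(w for w in vocabulary if edit_distance(query, w) <= max_edits)

vocabulary = ["iphone", "ipad", "phone", "apple"]
matches = fuzzy_match("iphne", vocabulary)  # ["iphone"]
```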
9️⃣ Example in Interview
E-commerce Search
User searches:
iphone under 80k
Elasticsearch handles:
- Keyword search = iphone
- Filter = price < 80000
- Sort = popularity
- Ranking relevance
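The breakdown above could translate into a search request body along these lines. This is a sketch in Elasticsearch Query DSL built as a Python dict: name and price come from the sample document in these notes, while the popularity field and the helper name build_search_body are assumptions for illustration.

```python
import json

def build_search_body(keyword, max_price, page=0, page_size=10):
    """Keyword match with typo tolerance, price filter, sort, and pagination."""
    return {
        "query": {
            "bool": {
                # Scored full-text match on the product name.
                "must": [
                    {"match": {"name": {"query": keyword, "fuzziness": "AUTO"}}}
                ],
                # Filters narrow the result set without affecting the score.
                "filter": [{"range": {"price": {"lt": max_price}}}],
            }
        },
        # `popularity` is a hypothetical field; ties fall back to relevance score.
        "sort": [{"popularity": {"order": "desc"}}, "_score"],
        "from": page * page_size,
        "size": page_size,
    }

body = build_search_body("iphone", 80000)
body_json = json.dumps(body)
```

Parsing "under 80k" into max_price=80000 is the Search Service's job before this body is built.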
🔟 Real System Design Usage
Example: BookMyShow / Ticket Search
Use DB for:
- bookings
- payments
- seats
Use Elasticsearch for:
- search movies
- city wise events
- theatre lookup
- filter by language/date
1️⃣1️⃣ Scaling Story
If traffic increases:
- Add more Elasticsearch nodes
- Increase shards (primary shard count is fixed at index creation, so this means reindexing or splitting the index)
- Add replicas
- Cache hot queries in Redis
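For the last point, a cache-aside sketch with a TTL. A plain dict stands in for Redis, search_backend stands in for the Elasticsearch call, and the clock is injectable so expiry can be shown without sleeping; all names are hypothetical.

```python
import time

class QueryCache:
    """Cache-aside with expiry: return cached results or compute and store them."""

    def __init__(self, ttl_seconds, clock=time.monotonic):
        self.ttl = ttl_seconds
        self.clock = clock
        self.store = {}  # query -> (expires_at, results)

    def get_or_compute(self, query, compute):
        now = self.clock()
        entry = self.store.get(query)
        if entry is not None and entry[0] > now:
            return entry[1]  # cache hit, backend is not called
        results = compute(query)
        self.store[query] = (now + self.ttl, results)
        return results

backend_calls = []

def search_backend(query):
    """Stand-in for the real search call; records every invocation."""
    backend_calls.append(query)
    return [f"result for {query}"]

fake_now = [0.0]  # controllable clock for the demo
cache = QueryCache(ttl_seconds=30, clock=lambda: fake_now[0])

first = cache.get_or_compute("iphone", search_backend)   # miss, hits backend
second = cache.get_or_compute("iphone", search_backend)  # hit, served from cache
fake_now[0] = 31.0
third = cache.get_or_compute("iphone", search_backend)   # expired, hits backend again
```

A short TTL keeps cached results from drifting too far from the index, at the cost of more backend hits.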
1️⃣2️⃣ Important Tradeoff
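Elasticsearch is near real-time, and with async indexing the search index is only eventually consistent with the primary DB.
New or updated data may show up in search results a few seconds after the DB write.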
Say this in interview. Very valuable point.
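A toy model of why the delay exists: indexed documents become searchable only after a refresh, which Elasticsearch performs periodically (by default roughly once per second). The class below is a made-up simplification of that behaviour, not Elasticsearch internals.

```python
class NearRealTimeIndex:
    """Documents are buffered on write and become searchable only after refresh()."""

    def __init__(self):
        self.buffer = []      # indexed but not yet visible to search
        self.searchable = []  # visible to search

    def index(self, doc):
        self.buffer.append(doc)

    def refresh(self):
        # Move everything buffered so far into the searchable set.
        self.searchable.extend(self.buffer)
        self.buffer.clear()

    def search(self, keyword):
        keyword = keyword.lower()
        return [d for d in self.searchable if keyword in d["name"].lower()]

idx = NearRealTimeIndex()
idx.index({"id": 101, "name": "iPhone 15"})
before_refresh = idx.search("iphone")  # [] because the doc is still buffered
idx.refresh()
after_refresh = idx.search("iphone")   # the doc is now visible
```

In the full design the async pipeline (queue plus worker) adds its own lag on top of the refresh delay.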
1️⃣3️⃣ What to Say in HLD
For search-heavy workloads, I’ll use Elasticsearch as a dedicated search engine.
Main relational DB remains source of truth.
Data is asynchronously indexed into Elasticsearch through Kafka workers.
Search API queries Elasticsearch for low latency and relevance ranking.
1️⃣4️⃣ Interview Gold Lines
- Use DB for transactions, Elasticsearch for search.
- Never use Elasticsearch as primary booking/payment DB.
- Great for read-heavy search systems.
- Supports horizontal scaling.
- Near real-time indexing.
1️⃣5️⃣ Quick 30 sec Answer
To optimize search, I’d introduce Elasticsearch. Product/event/user data is stored in primary DB, then asynchronously indexed into Elasticsearch. Search requests hit Elasticsearch, which gives fast full-text search, filtering, typo tolerance, and ranked results.
1️⃣6️⃣ If Interviewer Asks Why Not SQL?
Say:
SQL handles exact match queries well, but search relevance, typo tolerance, stemming, and scalable text search are better handled by Elasticsearch.