In Interviews
When to Bring Up Caching
- Read-heavy workload
  100M users * 20 reads/user/day
  = 2B reads/day
- Expensive queries
  Generate social media feed
  ↓
  Posts + Likes + Followers + Ranking
  "Feed generation is expensive, so I'll cache computed results."
- High database CPU
- Latency requirements
  Requirement: P99 < 100 ms -> 99% of requests must finish in less than 100 milliseconds.
  “To meet latency targets, frequently accessed data will be served from cache.”
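The read-volume arithmetic above can be carried one step further to a per-second rate, which is usually the number an interviewer wants to hear. This is a minimal sketch; the 3x peak factor is an assumption for illustration and does not come from these notes.

```python
# Back-of-envelope read load from the numbers in the notes.
USERS = 100_000_000              # 100M users
READS_PER_USER_PER_DAY = 20
SECONDS_PER_DAY = 24 * 60 * 60   # 86,400

reads_per_day = USERS * READS_PER_USER_PER_DAY
avg_reads_per_sec = reads_per_day / SECONDS_PER_DAY

# Assumption (not from the notes): peak traffic is about 3x the daily average.
PEAK_FACTOR = 3
peak_reads_per_sec = avg_reads_per_sec * PEAK_FACTOR

print(f"{reads_per_day:,} reads/day")
print(f"~{avg_reads_per_sec:,.0f} reads/sec on average")
print(f"~{peak_reads_per_sec:,.0f} reads/sec at the assumed peak")
```

Roughly 23K reads/sec on average is enough to justify saying "this is read-heavy, so I'll put a cache in front of the database."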
How to Introduce Caching
- Identify the bottleneck - What is slow? What is the cache key?
- Decide what to cache - Profile, Feed, Trending posts
- Choose your cache architecture - Cache Aside
- Set an eviction policy - I’ll use LRU with a 5-minute TTL
- Address the downsides - Consistency, Hot Keys
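The steps above (pick a key, cache the feed, use Cache Aside with a 5-minute TTL) can be sketched as a read path. This is an illustration only: `TTLCache` is a tiny in-memory stand-in for an external cache, and `build_feed_from_db` and the `feed:{user_id}` key format are hypothetical names, not part of any real library.

```python
import time


class TTLCache:
    """Tiny in-memory stand-in for an external cache such as Redis."""

    def __init__(self):
        self._data = {}  # key -> (value, expires_at)

    def get(self, key):
        entry = self._data.get(key)
        if entry is None:
            return None
        value, expires_at = entry
        if time.monotonic() >= expires_at:
            del self._data[key]  # lazily drop expired entries
            return None
        return value

    def set(self, key, value, ttl_seconds):
        self._data[key] = (value, time.monotonic() + ttl_seconds)


FEED_TTL_SECONDS = 5 * 60  # the 5-minute TTL from the notes
db_calls = 0


def build_feed_from_db(user_id):
    """Hypothetical expensive query (posts + likes + followers + ranking)."""
    global db_calls
    db_calls += 1
    return [f"post-for-{user_id}-1", f"post-for-{user_id}-2"]


def get_feed(cache, user_id):
    key = f"feed:{user_id}"                     # cache key
    feed = cache.get(key)                       # 1. try the cache
    if feed is None:                            # 2. miss: go to the database
        feed = build_feed_from_db(user_id)
        cache.set(key, feed, FEED_TTL_SECONDS)  # 3. populate for later reads
    return feed


cache = TTLCache()
first = get_feed(cache, 42)   # miss: hits the database
second = get_feed(cache, 42)  # hit: served from the cache
print(db_calls)
```

In Cache Aside the application owns both steps: the cache never talks to the database on its own.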
1️⃣ What is Caching?
- Store frequently accessed data in a faster storage layer.
- Reduces:
- Database load
- Latency
- Expensive recomputation
- Usually uses:
- RAM (Redis, Memcached)
- CDN
- Local memory
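📌 Memory (RAM) access is orders of magnitude faster than disk: roughly 100 ns for RAM versus roughly 100 µs for an SSD random read (~1,000x), and milliseconds for a spinning disk.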
Cache-diagram.excalidraw
2️⃣ Why Use Cache?
Use caching when:
- System is read-heavy
- Queries are expensive
- Low latency is required
- Database CPU/load is high
Example:
- Newsfeed generation
- User profiles
- Trending posts
- API responses
3️⃣ Types of Caching
| Type | Description | Use Case |
|---|---|---|
| External Cache | Separate cache server (Redis) | Most common |
| In-Process Cache | Inside app memory | Ultra low latency |
| CDN Cache | Edge servers near users | Images/videos/static data |
| Client Cache (less important) | Browser/mobile cache | Offline support |
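The first two rows of the table are often combined: check the in-process cache first, then the external cache. A minimal sketch of that lookup order follows; the plain `dict` standing in for Redis and the class name `TwoTierCache` are assumptions for illustration.

```python
class TwoTierCache:
    """In-process dict in front of an external cache (a dict stand-in here)."""

    def __init__(self, external):
        self.local = {}           # in-process cache: lives in app memory
        self.external = external  # stand-in for Redis/Memcached
        self.local_hits = 0
        self.external_hits = 0

    def get(self, key):
        if key in self.local:
            self.local_hits += 1
            return self.local[key]
        if key in self.external:
            self.external_hits += 1
            value = self.external[key]
            self.local[key] = value  # promote so the next read skips the network
            return value
        return None  # miss in both tiers; the caller falls back to the database


external = {"profile:1": {"name": "Ada"}}
cache = TwoTierCache(external)
cache.get("profile:1")  # served by the external tier, then promoted
cache.get("profile:1")  # served from app memory
print(cache.external_hits, cache.local_hits)
```

The sketch leaves the local tier unbounded and without a TTL; a real in-process cache needs both, since each app server holds its own copy and copies can drift apart.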
4️⃣ Cache Eviction Policies
An eviction policy decides which items stay in the cache and which are removed when it fills up.
| Policy | Meaning |
|---|---|
| LRU (common) | Remove least recently used |
| LFU | Remove least frequently used |
| FIFO | Remove oldest item |
| TTL (more common) | Expire after fixed time |
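LRU and TTL from the table are usually combined: LRU bounds the size, TTL bounds the age. The sketch below shows one way to do that with `collections.OrderedDict`; the class name `LRUTTLCache` and the injectable `clock` parameter (used so the demo can fast-forward time) are assumptions for illustration.

```python
import time
from collections import OrderedDict


class LRUTTLCache:
    def __init__(self, capacity, ttl_seconds, clock=time.monotonic):
        self.capacity = capacity
        self.ttl = ttl_seconds
        self.clock = clock
        self._items = OrderedDict()  # key -> (value, expires_at), oldest use first

    def get(self, key):
        entry = self._items.get(key)
        if entry is None:
            return None
        value, expires_at = entry
        if self.clock() >= expires_at:
            del self._items[key]          # TTL: the entry is too old
            return None
        self._items.move_to_end(key)      # mark as most recently used
        return value

    def put(self, key, value):
        self._items[key] = (value, self.clock() + self.ttl)
        self._items.move_to_end(key)
        if len(self._items) > self.capacity:
            self._items.popitem(last=False)  # LRU: evict the least recently used


now = [0.0]  # fake clock so the demo does not have to sleep
cache = LRUTTLCache(capacity=2, ttl_seconds=300, clock=lambda: now[0])
cache.put("a", 1)
cache.put("b", 2)
cache.get("a")               # "a" becomes the most recently used
cache.put("c", 3)            # over capacity: "b" is evicted
evicted_b = cache.get("b")   # None, removed by LRU
a_before = cache.get("a")    # 1, still fresh
now[0] += 301                # move past the 5-minute TTL
expired_a = cache.get("a")   # None, removed by TTL
```

Expired entries are only dropped when they are read; a production cache would also sweep them in the background.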
5️⃣ Common Issues / Deep Dives
A. Cache Stampede / Thundering Herd
When a popular/hot cache entry expires, many requests miss at the same time and all hit the database together → DB overload 💥
✅ Solutions:
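- Request coalescing / single flight ⭐ - Only ONE request rebuilds the cache entry; the others wait for its result.
- Cache warming - Proactively refresh hot cache keys before expiry (e.g., at 55 sec).
- TTL jitter - Add a small random offset to each TTL so hot keys don't all expire at the same moment.
- Serve stale data - Return the old value while the entry is refreshed in the background.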
Interview One-Liner ⭐
“To prevent cache stampede, I would use cache warming + single-flight locking + TTL jitter.”
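Single-flight locking and TTL jitter from the one-liner can be sketched with one lock per key: the first thread to miss rebuilds the entry and the rest wait, then read the fresh value. The 60-second base TTL, the 10% jitter, and the helper names are hypothetical, and the dicts stand in for an external cache within a single process.

```python
import random
import threading
import time

cache = {}       # key -> (value, expires_at); stand-in for Redis
key_locks = {}   # key -> lock guarding the rebuild of that key
state_lock = threading.Lock()  # protects cache, key_locks, and db_calls
db_calls = 0

BASE_TTL = 60    # seconds (hypothetical value)
JITTER = 0.1     # +/-10% so hot keys do not all expire together


def jittered_ttl():
    return BASE_TTL * random.uniform(1 - JITTER, 1 + JITTER)


def load_from_db(key):
    """Hypothetical slow query."""
    global db_calls
    with state_lock:
        db_calls += 1
    time.sleep(0.05)
    return f"value-for-{key}"


def cache_get(key):
    with state_lock:
        entry = cache.get(key)
    if entry is not None and entry[1] > time.monotonic():
        return entry[0]
    return None


def get(key):
    value = cache_get(key)
    if value is not None:
        return value
    with state_lock:
        lock = key_locks.setdefault(key, threading.Lock())
    with lock:                      # single flight: one rebuild per key
        value = cache_get(key)      # re-check: another thread may have filled it
        if value is None:
            value = load_from_db(key)
            with state_lock:
                cache[key] = (value, time.monotonic() + jittered_ttl())
    return value


threads = [threading.Thread(target=get, args=("feed:42",)) for _ in range(20)]
for t in threads:
    t.start()
for t in threads:
    t.join()
print(db_calls)
```

Twenty concurrent misses produce a single database call. Across several app servers the same idea needs a shared lock (for example, one held in the cache itself), which this in-process sketch does not cover.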
Cache-Issue.excalidraw
B. Cache Consistency
Cache and Database contain different values → users see stale (old) data.
✅ Solutions:
- Cache Invalidation ⭐ - Update DB -> delete Cache
- Short TTL
- Eventual consistency - Accept temporary stale data.
Interview One-Liner ⭐
“To maintain cache consistency, I would use cache invalidation on write + TTL, while accepting eventual consistency where freshness isn’t critical.”
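The "update DB, then delete cache" write flow can be sketched as follows. Both stores are plain dicts standing in for a database and an external cache, and `read_user` / `update_user` are hypothetical helper names.

```python
db = {"user:1": {"name": "Ada"}}  # stand-in for the database
cache = {}                        # stand-in for Redis


def read_user(key):
    if key in cache:
        return cache[key]
    value = db[key]
    cache[key] = value   # cache-aside fill on a miss
    return value


def update_user(key, new_value):
    db[key] = new_value      # 1. write to the source of truth first
    cache.pop(key, None)     # 2. then delete the cached copy (no error if absent)


before = read_user("user:1")               # fills the cache
update_user("user:1", {"name": "Grace"})
after = read_user("user:1")                # miss, so the fresh row is reloaded
print(before, after)
```

Deleting rather than overwriting the cached value keeps the write path simple. A short TTL remains useful as a safety net in case the delete fails or races with a concurrent read.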
C. Hot Keys
One cache key gets way more traffic than all others → a single cache server becomes overloaded.
Example:
- Celebrity profile `profile:taylor_swift`
- Trending post
Traffic
↓
A → 95% 🔥 // A holds the taylor_swift entry
B → 3%
C → 2%
✅ Solutions:
- Replicate hot keys on all cache replica nodes
- Local in-memory cache
- Load balancing across cache servers
Interview One-Liner ⭐
“To solve hot keys, I would replicate popular cache entries across multiple cache nodes and optionally use local in-memory caching.”
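Replicating a hot key across nodes can be sketched as a routing rule: normal keys live on one node chosen by hash, while hot keys are stored on every node and reads pick one at random. The node names follow the A/B/C example above; the `HOT_KEYS` set and the helper names are assumptions, and how hot keys get detected is out of scope here.

```python
import random
import zlib
from collections import Counter

NODES = ["A", "B", "C"]              # cache servers from the example above
HOT_KEYS = {"profile:taylor_swift"}  # assumed to be detected elsewhere


def home_node_index(key):
    # crc32 is deterministic, so every app server agrees on the home node.
    return zlib.crc32(key.encode("utf-8")) % len(NODES)


def nodes_for(key):
    """Nodes holding a copy: all of them for hot keys, one otherwise."""
    if key in HOT_KEYS:
        return list(NODES)
    return [NODES[home_node_index(key)]]


def pick_read_node(key, rng=random):
    """Reads choose any node holding the key, spreading hot-key traffic."""
    return rng.choice(nodes_for(key))


rng = random.Random(0)  # seeded only to make the demo repeatable
reads = Counter(pick_read_node("profile:taylor_swift", rng) for _ in range(9000))
print(reads)
```

The trade-off is on the write side: an update or invalidation of a hot key now has to reach every node in `nodes_for(key)`.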
7️⃣ Common Tools
| Tool | Purpose |
|---|---|
| Redis | Distributed cache |
| Memcached | Simple cache |
| Cloudflare | CDN |
| Amazon CloudFront | CDN |
8️⃣ What to Mention in Interviews
Always Explain:
- Why cache is needed
- What data is cached
- Cache key
- TTL / eviction policy
- Read/write flow
- Consistency handling
- Stampede handling
9️⃣ Interview Default Recommendation
Default Stack
- Cache Type → External cache
- Tool → Redis
- Pattern → Cache Aside
- Eviction → LRU + TTL
This works for most interviews. 🚀
Quick Revision
| Concept | Default Choice |
|---|---|
| Cache Server | Redis |
| Pattern | Cache Aside |
| Eviction | LRU + TTL |
| Main Goal | Faster reads |
| Biggest Problems | Stale data, stampede, hot keys |