In an Interview

When to Bring Up Caching

  • Read-heavy workload
    100M users × 20 reads/user/day = 2B reads/day
  • Expensive queries
    Generating a social media feed: Posts + Likes + Followers + Ranking
    "Feed generation is expensive, so I'll cache computed results."
  • High database CPU
  • Latency requirements
    Requirement: P99 < 100 ms → 99% of requests must finish in under 100 milliseconds.
    "To meet latency targets, frequently accessed data will be served from cache."
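
As a rough sanity check on the read-heavy estimate above (100M users, 20 reads per user per day), the daily total can be converted into reads per second. This is only a back-of-envelope sketch; the 5x peak factor is an assumption for illustration, not a number from these notes.

```python
# Back-of-envelope: convert reads/day into average and peak reads/second.
USERS = 100_000_000            # 100M users (from the estimate above)
READS_PER_USER_PER_DAY = 20    # from the estimate above
SECONDS_PER_DAY = 24 * 60 * 60
PEAK_FACTOR = 5                # assumption: peak traffic is ~5x the average

reads_per_day = USERS * READS_PER_USER_PER_DAY
avg_reads_per_sec = reads_per_day / SECONDS_PER_DAY
peak_reads_per_sec = avg_reads_per_sec * PEAK_FACTOR

print(f"reads/day:      {reads_per_day:,}")
print(f"avg reads/sec:  {avg_reads_per_sec:,.0f}")
print(f"peak reads/sec: {peak_reads_per_sec:,.0f}")
```

2B reads/day works out to roughly 23K reads/sec on average, which is usually enough to justify putting a cache in front of the database.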

How to Introduce Caching

  1. Identify the bottleneck - What is slow? What would the cache key be?
  2. Decide what to cache - Profile, feed, trending posts
  3. Choose your cache architecture - Cache Aside
  4. Set an eviction policy - "I'll use LRU with a 5-minute TTL"
  5. Address the downsides - Consistency, Hot Keys

1️⃣ What is Caching?

  • Store frequently accessed data in a faster storage layer.
  • Reduces:
    • Database load
    • Latency
    • Expensive recomputation
  • Usually uses:
    • RAM (Redis, Memcached)
    • CDN
    • Local memory

📌 Memory (RAM) access is roughly 1,000x faster than an SSD random read (about 100 ns vs. about 100 µs); the gap to a spinning disk is larger still.

Cache-diagram.excalidraw


2️⃣ Why Use Cache?

Use caching when:

  • System is read-heavy
  • Queries are expensive
  • Low latency is required
  • Database CPU/load is high

Example:

  • Newsfeed generation
  • User profiles
  • Trending posts
  • API responses

3️⃣ Types of Caching

| Type | Description | Use Case |
| --- | --- | --- |
| External Cache | Separate cache server (Redis) | Most common |
| In-Process Cache | Inside app memory | Ultra-low latency |
| CDN Cache | Edge servers near users | Images/videos/static data |
| Client Cache (less important) | Browser/mobile cache | Offline support |
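
For the in-process cache type listed above, Python's standard library provides `functools.lru_cache`, which keeps results in the application's own memory. The sketch below assumes a hypothetical `load_user_profile` function whose body is only a placeholder for an expensive lookup.

```python
from functools import lru_cache

calls = 0  # counts how often the function body actually runs

@lru_cache(maxsize=1024)  # keep at most 1024 results, evicting the least recently used
def load_user_profile(user_id: int) -> dict:
    """Hypothetical expensive lookup; the body here is a placeholder."""
    global calls
    calls += 1
    return {"id": user_id, "name": f"user-{user_id}"}

load_user_profile(7)   # miss: runs the function body
load_user_profile(7)   # hit: served from process memory
print(load_user_profile.cache_info())
```

Each application instance holds its own copy of the data, and `lru_cache` has no TTL, so this fits data that rarely changes.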

4️⃣ Cache Eviction Policies

An eviction policy decides which items stay in the cache and which are removed when it fills up.

| Policy | Meaning |
| --- | --- |
| LRU (common) | Remove the least recently used item |
| LFU | Remove the least frequently used item |
| FIFO | Remove the oldest item |
| TTL (more common) | Expire after a fixed time |
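
To make the LRU and TTL rows concrete, here is a minimal sketch that combines both in one small class. `LRUTTLCache` and its `now` parameter are hypothetical names for illustration; the class is not thread-safe and is not how Redis implements eviction internally.

```python
import time
from collections import OrderedDict

class LRUTTLCache:
    """Tiny LRU cache with a per-entry TTL (illustrative, not thread-safe)."""

    def __init__(self, capacity, ttl_seconds):
        self.capacity = capacity
        self.ttl = ttl_seconds
        self.items = OrderedDict()  # key -> (value, expires_at), least recently used first

    def get(self, key, now=None):
        now = time.monotonic() if now is None else now
        entry = self.items.get(key)
        if entry is None:
            return None
        value, expires_at = entry
        if expires_at <= now:            # TTL: entry is too old, drop it
            del self.items[key]
            return None
        self.items.move_to_end(key)      # LRU: mark as most recently used
        return value

    def put(self, key, value, now=None):
        now = time.monotonic() if now is None else now
        self.items[key] = (value, now + self.ttl)
        self.items.move_to_end(key)
        if len(self.items) > self.capacity:
            self.items.popitem(last=False)  # LRU: evict the least recently used entry
```

With `capacity=2`, putting "a" and "b", reading "a", and then putting "c" evicts "b", because "b" is the entry that was used least recently.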

5️⃣ Common Issues / Deep Dives

A. Cache Stampede / Thundering Herd

When a popular/hot cache entry expires, many requests miss at the same time and all hit the database together → DB overload 💥

✅ Solutions:

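  • Request coalescing / single flight ⭐ - Only ONE request rebuilds the cache entry; the others wait for its result.
  • Cache warming - Proactively refresh hot cache keys before they expire (e.g., refresh at 55 sec).
  • TTL jitter - Add a small random offset to each TTL so hot keys don't all expire at the same moment.
  • Serve stale data - Return the old value while the entry is refreshed in the background.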

Interview One-Liner ⭐

“To prevent cache stampede, I would use cache warming + single-flight locking + TTL jitter.”

Cache-Issue.excalidraw


B. Cache Consistency

Cache and Database contain different values → users see stale (old) data.

✅ Solutions:

  • Cache invalidation ⭐ - Update the DB, then delete the cache entry.
  • Short TTL - Bounds how long stale data can live.
  • Eventual consistency - Accept temporary stale data.
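
Invalidation on write can be sketched as follows: the write path updates the database and then deletes the cached copy, so the next read misses and reloads the fresh value. The two dicts and the function names are stand-ins chosen for illustration.

```python
db = {"profile:1": {"name": "Ada"}}   # stands in for the database
cache = {}                             # stands in for the cache

def read_profile(key):
    """Cache-aside read."""
    if key in cache:
        return cache[key]
    value = db[key]
    cache[key] = value
    return value

def update_profile(key, new_value):
    """Write path: update the DB first, then delete the cached copy."""
    db[key] = new_value
    cache.pop(key, None)   # next read misses and reloads the fresh value

read_profile("profile:1")                        # warms the cache
update_profile("profile:1", {"name": "Grace"})   # invalidates the cached entry
print(read_profile("profile:1"))                 # {'name': 'Grace'}
```

Deleting the entry rather than overwriting it keeps the write path simple; a TTL on the entry still acts as a safety net if a delete is ever missed.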

Interview One-Liner ⭐

“To maintain cache consistency, I would use cache invalidation on write + TTL, while accepting eventual consistency where freshness isn’t critical.”


C. Hot Keys

One cache key gets way more traffic than all others → a single cache server becomes overloaded.

Example:

  • Celebrity profile: profile:taylor_swift
  • Trending post

Traffic per cache node:

A → 95% 🔥 (A holds the taylor_swift entry)
B → 3%
C → 2%

✅ Solutions:

  • Replicate hot keys across multiple cache nodes
  • Local in-memory cache
  • Load balancing across cache servers
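
One way to replicate a hot key is to store several copies under suffixed keys and read a random copy. The sketch below is a toy model under assumptions: the three dicts stand in for cache nodes A, B, and C, the `#i` suffix scheme and `REPLICAS = 3` are hypothetical, and the byte-sum hash is a deliberately simple placeholder (real systems often use consistent hashing).

```python
import random

REPLICAS = 3                  # assumed number of copies per hot key
NODES = ["A", "B", "C"]       # cache nodes from the example above
node_store = {n: {} for n in NODES}

def node_for(key):
    """Pick a node by hashing the key (toy hash, for illustration only)."""
    return NODES[sum(key.encode()) % len(NODES)]

def put_hot(key, value):
    """Write one copy per suffix so the copies tend to land on different nodes."""
    for i in range(REPLICAS):
        replica_key = f"{key}#{i}"
        node_store[node_for(replica_key)][replica_key] = value

def get_hot(key):
    """Read a random copy so traffic spreads across the nodes."""
    replica_key = f"{key}#{random.randrange(REPLICAS)}"
    return node_store[node_for(replica_key)].get(replica_key)
```

After `put_hot("profile:taylor_swift", ...)`, reads are spread across the nodes holding a copy instead of all landing on node A. The trade-off is that every write or invalidation has to touch all copies.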

Interview One-Liner ⭐

“To solve hot keys, I would replicate popular cache entries across multiple cache nodes and optionally use local in-memory caching.”


7️⃣ Common Tools

| Tool | Purpose |
| --- | --- |
| Redis | Distributed cache |
| Memcached | Simple cache |
| Cloudflare | CDN |
| Amazon CloudFront | CDN |

8️⃣ What to Mention in Interviews

Always Explain:

  1. Why cache is needed
  2. What data is cached
  3. Cache key
  4. TTL / eviction policy
  5. Read/write flow
  6. Consistency handling
  7. Stampede handling

9️⃣ Interview Default Recommendation

Default Stack

  • Cache Type → External cache
  • Tool → Redis
  • Pattern → Cache Aside
  • Eviction → LRU + TTL

This works for most interviews. 🚀


Quick Revision

| Concept | Default Choice |
| --- | --- |
| Cache Server | Redis |
| Pattern | Cache Aside |
| Eviction | LRU + TTL |
| Main Goal | Faster reads |
| Biggest Problems | Stale data, stampede, hot keys |