1️⃣ What is the Problem?

Imagine you have a URL Shortener / Cache / Database.

Initially only 1 server:

S1

As users grow, you add more servers:

S1   S2   S3

Now question:

How do we decide which server stores a user/key/data?

2️⃣ Normal Hashing (Simple Hash + Mod)

We use:

server = hash(key) % N

Where:

key = user ID / cache key / request ID
N = number of servers

Example:

User ID = 25

25 % 3 = 1

Store in:

S1

Works fine.

3️⃣ Problem with Normal Hashing

Now add one more server.

Before:

25 % 3 = 1 → S1

After adding S4:

25 % 4 = 1 → S1

Maybe okay for this key, but many keys change.

Example:

User ID = 14

Before:

14 % 3 = 2 → S2

After:

14 % 4 = 2 → S2

Try another:

User ID = 10

Before:

10 % 3 = 1 → S1

After:

10 % 4 = 2 → S2

Now data moves.

Big issue:

When adding/removing server → many keys move

That means:

heavy redistribution
more network traffic
downtime risk
poor scalability

4️⃣ Consistent Hashing Idea

Instead of hash(key) % N

We create a circular ring.

Servers are placed on ring.

Keys are also placed on ring.

Then assign key to nearest clockwise server.

Simple.

5️⃣ Easy Example

Assume ring:

0 → 100

Place servers:

S1 = 20
S2 = 50
S3 = 80

Ring:

        S1(20)
 
   S3(80)      S2(50)

(Circular)

6️⃣ Add Key

Suppose key = 35

Now move clockwise.

After 35, next server = 50

So:

35 → S2

Suppose key = 70

Clockwise next server = 80

So:

70 → S3

Suppose key = 90

Move clockwise.

Wrap around.

Next server = 20

So:

90 → S1

This wrap-around is very important.

7️⃣ Add New Server

Now add:

S4 = 65

Ring:

20(S1) → 50(S2) → 65(S4) → 80(S3)

Who moves?

Only keys between:

50 → 65

Earlier → S3
Now → S4

Only small part moves.

Good scalability ✅

8️⃣ Remove Server

Suppose S2 fails.

Earlier:

Keys between:

20 → 50

went to S2.

Now nearest clockwise server = S4 or S3.

Only S2’s keys move.

Not whole system.

Very efficient.

9️⃣ Why Better Than Modulo Hashing?

Normal hashing:

Add 1 server → almost all keys may change ❌

Consistent hashing:

Add/remove 1 server → only nearby keys move ✅

This is the biggest interview point.

🔟 Virtual Nodes (Very Important)

Problem:

If each server appears only once:

S1   S2   S3

Load may become uneven.

One server can get too much traffic.

Fix → Virtual Nodes

Same server placed multiple times.

Example:

S1 -> 10, 40, 70
S2 -> 20, 50, 90
S3 -> 30, 60, 80

Now data spreads more evenly.

Benefits:

better load balancing
less hotspot
fault tolerance
smoother scaling

Most interviewers like hearing virtual nodes.

1️⃣1️⃣ Where Used?

Use in distributed systems like:

Distributed Cache
Distributed DB
Sharding
Load Balancer
CDN
Message Queue

Common examples:

Redis
Apache Cassandra

1️⃣2️⃣ Interview Answer (Best Short Answer)

If interviewer asks “What is consistent hashing?”

Say:

Consistent hashing is a data distribution technique where both servers and keys are mapped on a circular ring. A key is assigned to the nearest clockwise server. When a server is added or removed, only a small subset of keys are remapped, which improves scalability and fault tolerance. Virtual nodes are used for better load balancing.

1️⃣3️⃣ Quick Comparison

Feature	Modulo Hashing	Consistent Hashing
Add server	Many keys move	Few keys move
Remove server	Many keys move	Few keys move
Rebalancing	High	Low
Scaling	Poor	Good
Load balancing	Weak	Better (VNodes)

1️⃣4️⃣ Easy Memory Trick 🧠

Remember:

Modulo Hashing → reshuffle almost everything
Consistent Hashing → move only nearby keys
Virtual Nodes → distribute load evenly

Om's Brain

Explorer

4. Consistent hasing

1️⃣ What is the Problem?

2️⃣ Normal Hashing (Simple Hash + Mod)

3️⃣ Problem with Normal Hashing

Big issue:

4️⃣ Consistent Hashing Idea

5️⃣ Easy Example

6️⃣ Add Key

7️⃣ Add New Server

8️⃣ Remove Server

9️⃣ Why Better Than Modulo Hashing?

🔟 Virtual Nodes (Very Important)

Fix → Virtual Nodes

1️⃣1️⃣ Where Used?

1️⃣2️⃣ Interview Answer (Best Short Answer)

1️⃣3️⃣ Quick Comparison

1️⃣4️⃣ Easy Memory Trick 🧠

Table of Contents

Mindmap

Graph View