bitly.excalidraw


1️⃣ NFR - CAP Theorem Decision

Need Strong Consistency?

Required for Banking, Ticket Booking, and Trading Systems.

  • Reason: Every read must see the latest write.

Bitly?

Not necessary. Scenario: Create URLImmediately open URL → Fails for a few seconds. This is acceptable.

Decision

  • CP ❌ / AP ✅
  • Choose: High Availability, Eventual Consistency

2️⃣ 301 vs 302 Redirect

Aspect301 Permanent Redirect302 Temporary Redirect
MeaningRedirect is permanentRedirect is temporary
Browser CachingClient/browser cache the redirectnot cached
Future RequestsMay bypass Bitly after cachingAlways reach Bitly first
Server TrafficLowerHigher
Infrastructure CostLowerHigher
Analytics Tracking❌ Limited / Lost after caching✅ Full tracking
Click Counting❌ Not reliable✅ Accurate
Monitoring & Observability❌ Harder✅ Easier
A/B Testing Support❌ Difficult✅ Easy
Common Use CasePermanent URL migrationURL shorteners, tracking links
Bitly Interview Choice❌ Usually not preferred✅ Preferred

3️⃣ Deep Dive #1 - Generating Unique Short Codes

Option 1: URL Prefix

Take first few characters.

  • Example: https://google.comhttps:
  • Problem: Many URLs share same prefix (e.g., https://twitter.com, https://facebook.com), leading to a high collision rate.
  • Decision: ❌ Reject

Option 2: Random Number + Base62

Generate a random number and convert to Base62 (characters 0-9, A-Z, a-z, total 62 characters).

  • Why Base62? A 6-character code yields 62^6 ≈ 56 Billion combinations.
  • Problem: Birthday Paradox. Even with 56 billion combinations, storing 1 Billion URLs can still produce collisions.
  • Solution (Check-and-Put): Generate CodeCheck DBCollision? → If yes, Generate Again.

Option 3: Hash Long URL

  • SHA256(url)Base62 Encode → Take first 6 chars.
  • Problem: Still has collisions.
  • Solution: Same fix: check DB first.

Use a counter (1, 2, 3, 4...) and convert using Base62.

  • Example: 11, 10A, 61z, 6210.
  • Benefits: Fast, unique, no collision checks.
  • Drawback: Predictable IDs. Competitors can guess links (e.g., bit.ly/1, bit.ly/2, bit.ly/3) and scrape them.
  • Solution: Use a bijective function with obfuscation: CounterObfuscationBase62 (e.g., using libraries like Sqids to produce random-looking codes like fG8xT2 or kL9pQ1).

4️⃣ Deep Dive #2 - Reducing Redirect Latency

Step 1: Database Indexing

Without index, querying 1 Billion rows requires a slow Full Scan.

  • Solution: Set PRIMARY KEY(short_url).
  • The database builds a B-Tree Index, reducing lookup time to O(log N).

Step 2: Redis Cache

Flow: ClientServerRedisDatabase

Cache Structure

  • Key: Short URL (e.g., abc123)
  • Value: Original URL (e.g., https://google.com)

Read-Through Cache Strategy

  • Check Redis.
  • If Hit: Return immediately.
  • If Miss: Read DB → Update Cache → Return result.
  • Eviction Policy: LRU (Least Recently Used) to automatically remove old/inactive URLs.

Step 3: CDN (Optional)

Store redirects at edge locations (e.g., China UserChina CDN instead of going to California Server).

  • Pros: Faster global latency.
  • Cons: Lose analytics visibility.

5️⃣ Deep Dive #3 - Scaling

Estimate Traffic

  • Given: 100 Million DAU
  • Approx: 10^8 users
  • Seconds/day: 10^5
  • RPS: 10^8 / 10^5 ≈ 10^3 ≈ 1000 RPS
  • Peak: multiply by 10 & 100 10k - 100k RPS

Scale Application Layer

  • Vertical Scaling: Bigger server (More CPU/Memory).
  • Horizontal Scaling (Preferred): ClientLoad BalancerServer 1/2/3.

Split Services

Instead of one Monolith, use: API GatewayRead Service & Write Service.

  • Reason: Reads >> Writes (redirect traffic dominates).

Auto Scaling

Scale out automatically when CPU > 75% or Memory > 75%.


6️⃣ Scaling Counter Service

  • Problem: Multiple Write Servers maintaining their own counter leads to collisions (e.g., Server A -> 10, Server B -> 10).
  • Solution: Global Counter (Write ServiceRedis Counter via INCR). Since Redis is single-threaded, it guarantees unique sequential IDs.
  • Optimization (Batching): Allocate batches of IDs to servers (e.g., Server A gets 1-1000, Server B gets 1001-2000) to reduce Redis calls. As we can afford to loose some count as total allowed is 62^6 ≈ 56 Billion ≈ 10^10

7️⃣ Database Scaling

  • Storage Estimate: ~500 bytes per row * 1 Billion URLs = 10^11 bytes/10^6 byte per gb= 500 GB.
  • Verdict: A single DB is enough since modern SSDs support multiple TBs.
  • Sharding (If needed): Shard by short_url (e.g., hash(short_url) % N) to route requests to the correct shard.

8️⃣ High Availability

  • Redis Cache Failure: Non-critical. Cache miss leads to DB query which rebuilds the cache.
  • Global Counter Failure: Critical. Requires replication, snapshots, and failover.
  • Database Failure: Use a Primary DBReplica DB setup with hourly snapshots to S3. Recovery is done via replica promotion or restoring a snapshot.

Final Architecture

                    +----------------+
                    |  API Gateway   |
                    +----------------+
                       /         \
                      /           \
                     v             v
           +---------------+ +---------------+
           | Read Service  | | Write Service |
           +---------------+ +---------------+
                  |                 |
                  |                 |
                  v                 v
             +---------+      +------------+
             | Redis   |      | Global     |
             | Cache   |      | Counter    |
             +---------+      +------------+
                  |                 |
                  +--------+--------+
                           |
                           v
                    +-------------+
                    | PostgreSQL  |
                    +-------------+
                           |
                           v
                    Replica + Backup

Interview Summary

Functional Requirements

✅ Create short URL
✅ Redirect users
✅ Custom Alias
✅ Expiration Time

Non-Functional Requirements

✅ Low Latency (<200ms)
✅ Scalability (100M DAU)
✅ Unique Short Codes
✅ High Availability
✅ Eventual Consistency

Key Technologies

  • PostgreSQL
  • Redis Cache
  • Redis Global Counter (INCR)
  • Base62 Encoding
  • B-Tree Index
  • API Gateway
  • Horizontal Scaling
  • Read/Write Service Separation
  • Database Replication
  • Snapshots/Backups