1️⃣ What is a Bloom Filter?

  • A Bloom Filter is a space-efficient probabilistic data structure
  • Used to check:
    • πŸ‘‰ β€œIs this element present in a set?”
  • It can say:
    • βœ… Definitely NOT present
    • ⚠️ Possibly present (may be wrong)

2️⃣ Why use Bloom Filter? Trade-offs

βœ… Use when:

  • You want to reduce DB calls
  • Memory is limited - don’t require much space
  • Fast lookup is required
  • Approximation is okay
  • Simple to implement

❌ Avoid when:

  • You need exact results
  • Deletion is required (use counting BF instead)

πŸ“Œ Common Use Cases

  • Databases (avoid unnecessary disk lookups) - e.g username checks in 20M records
    • Before hitting DB:
      • Check Bloom Filter
        • If NOT present β†’ skip DB call πŸš€
        • If maybe present β†’ query DB
  • Caching systems
  • Web crawlers (avoid revisiting URLs)
  • Spell checkers
  • Distributed systems (like Apache Cassandra, Redis)

3️⃣ Structure

  • A Bloom Filter consists of:
Bit Array (size = m)
[0, 0, 0, 0, 0, 0, 0, ...]

&&

`k` hash functions

4️⃣ How it Works

➀ Insert Operation

  • For element "cat":
  1. Apply k hash functions:
h1(cat) β†’ 2
h2(cat) β†’ 5
h3(cat) β†’ 8
  1. Set bits at those indices:
Index:  0 1 2 3 4 5 6 7 8
Array: [0 0 1 0 0 1 0 0 1]

➀ Search Operation

  • To check "cat":
    • Check bits at h1, h2, h3
All = 1 β†’ Possibly present βœ…
  • To check "dog":
If ANY bit = 0 β†’ Definitely NOT present ❌

Time Complexity

  • Insert β†’ O(k)
  • Search β†’ O(k)
  • Where:k = number of hash functions

5️⃣ False Positives

πŸ“Œ What is it?

  • Bloom Filter may say:
    • "dog" is present ❌ (but actually not)

πŸ“Œ Why?

  • Multiple elements may map to same bits

6️⃣ Key Properties

PropertyValue
False Negatives❌ Never
False Positivesβœ… Possible
Deletion Support❌ Not supported (basic version)
Space Efficiencyβœ… Very high
Lookup Time⚑ O(k)

7️⃣ Variants

  • Counting Bloom Filter
    • Uses counters instead of bits
    • βœ… Supports deletion
  • Scalable Bloom Filter
    • Grows dynamically

8️⃣ Interview One Liner

β€œBloom Filter is a space-efficient probabilistic data structure used to check membership. It uses multiple hash functions to set bits in a bit array. It guarantees no false negatives but may have false positives. It’s commonly used in systems like caches and databases to avoid expensive lookups.” Bloom Filter = Fast + Memory Efficient + Probabilistic