Bloom & Cuckoo Filters

Insert items, watch hash functions map to bits or buckets, and discover when false positives strike.

How Bloom filters work

A Bloom filter is a bit array of m bits, paired with k independent hash functions. To insert an item, you hash it with each of the k functions and set the corresponding bits to 1. To check membership, you hash again and check if all k bits are set. If any bit is 0, the item is definitely not in the set. If all bits are 1, the item is probably in the set — but it might be a false positive.

Why false positives happen

As more items are inserted, more bits get set. Eventually, for a given non-member, all k of its hash positions might coincidentally already be set by other insertions. The theoretical false positive rate is (1 - e^-kn/m)^k, where n is the number of items. Add enough items above and watch the rate climb — then try looking up items you never inserted.

Why you can't delete

Unsetting a bit would break things. If two items share a hash position (which is expected), clearing that bit when you "delete" one item would also remove the other. There's no way to know how many items map to each bit, so deletion is off the table.

Where you'll find them

Bloom filters are everywhere: LSM-tree storage engines (LevelDB, RocksDB, Cassandra) use them to skip unnecessary disk reads. CDNs use them to avoid caching one-hit-wonder URLs. Spell checkers, safe browsing lists, Bitcoin SPV nodes checking if a transaction matches a filter — any time you need a fast "definitely not" answer, a Bloom filter is a good fit.

Bloom & Cuckoo Filters

How Bloom filters work

Why false positives happen

Why you can't delete

Where you'll find them

How Cuckoo filters work

The cuckoo hashing trick

Why deletion works

Trade-offs vs. Bloom

Where you'll find them