Skip to content

Latest commit

 

History

History
21 lines (17 loc) · 1.05 KB

68-bloom-filter.md

File metadata and controls

21 lines (17 loc) · 1.05 KB
layout title date comments categories language references
post
Bloom Filter
2018-10-09 12:39
true
system design, data structures
en

A Bloom filter is a data structure used to detect whether an element is in a set in a time and space efficient way.

False positive matches are possible, but false negatives are not – in other words, a query returns either "possibly in set" or "definitely not in set". Elements can be added to the set, but not removed (though this can be addressed with a "counting" bloom filter); the more elements that are added to the set, the larger the probability of false positives.

Usecases

  • Cassandra uses Bloom filters to determine whether an SSTable has data for a particular row.
  • An HBase Bloom Filter is an efficient mechanism to test whether a StoreFile contains a specific row or row-col cell.
  • A website's anti-fraud system can use bloom filters to reject banned users effectively.
  • The Google Chrome web browser used to use a Bloom filter to identify malicious URLs.