Skip to content

A simple implementation of a Bloom filter, a space-efficient probabilistic data structure, in Rust.

Notifications You must be signed in to change notification settings

alexanderbez/rust-bloom

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

11 Commits
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 

Repository files navigation

rsbloom   Build Status Latest Version Rustc Version License

A simple implementation of a Bloom filter, a space-efficient probabilistic data structure.

Bloom Filters

A Bloom filter is a space-efficient probabilistic data structure that is used to test whether an element is a member of a set. It allows for queries to return: "possibly in set" or "definitely not in set". Elements can be added to the set, but not removed; the more elements that are added to the set, the larger the probability of false positives. It has been shown that fewer than 10 bits per element are required for a 1% false positive probability, independent of the size or number of elements in the set.

The provided implementation allows you to create a Bloom filter specifying the approximate number of items expected to inserted and an optional false positive probability. It also allows you to approximate the total number of items in the filter.

Enhanced Double Hashing

Enhanced double hashing is used to set bit positions within a bit vector. The choice for double hashing was shown to be effective without any loss in the asymptotic false positive probability, leading to less computation and potentially less need for randomness in practice by Adam Kirsch and Michael Mitzenmacher in Less Hashing, Same Performance: Building a Better Bloom Filter.

The enhanced double hash takes the form of the following formula:

gi(x) = (H1(x) + iH2(x) + f(i)) mod m, where

H1 is Murmur3 128-bit, H2 is xxHash 64-bit, and f(i) = i3

Usage

Add the rsbloom dependency to your Cargo.toml:

[dependencies]
rsbloom = "0.1.0"

Example

use rsbloom::BloomFilter;

fn main() {
  let approx_items = 100;
  let mut bf = BloomFilter::new(approx_items);

  bf.set(&"foo");
  bf.set(&"bar");

  bf.has(&"foo"); // true
  bf.has(&"bar"); // true
  bf.has(&"baz"); // false

  bf.num_items_approx(); // 2
}

Tests

make test

Benchmarks

make bench

About

A simple implementation of a Bloom filter, a space-efficient probabilistic data structure, in Rust.

Topics

Resources

Stars

Watchers

Forks

Packages

No packages published