org.edg.data.util
Class BloomFilter

java.lang.Object
  |
  +--org.edg.data.util.BloomFilter
All Implemented Interfaces:
java.io.Serializable
Direct Known Subclasses:
CountingBloomFilter, SimpleBloomFilter

public abstract class BloomFilter
extends java.lang.Object
implements java.io.Serializable

A Representation of a Bloom Filter. See The Math of Bloom Filters for how to tune the Filter. Note you must specify the size of the filter a priori (i.e. set m), but it can tell you when it has too many entries, and will start to degrade (i.e. more than n entries. We use MD5 to compute the hash functions, which are fixed to 4.

Version:
$ Id:$
Author:
James Casey
See Also:
Serialized Form

Field Summary
protected  byte[] m_bits
          the actual bits of the filter
 
Constructor Summary
BloomFilter(byte[] bits, int entries, int numBits, int optimalNumKeys)
          Create a new BloomFilter from its constituent parts
BloomFilter(int bitsPerEntry, int optimalNumKeys)
          create a new bloom filter
 
Method Summary
 void add(java.lang.String entry)
          Add an entry to the Filter
protected abstract  void clearBit(int bit)
           
 boolean contains(java.lang.String entry)
           
 void delete(java.lang.String entry)
           
 boolean equals(java.lang.Object o)
           
 int getNumberOfHashes()
           
 int getNumBits()
           
 int getNumEntries()
           
 int getOptimalNumKeys()
           
 int hashCode()
           
 boolean isFull()
           
protected abstract  boolean isSet(int bit)
           
protected abstract  void setBit(int bit)
           
 java.lang.String toString()
           
 void writeBits(java.io.OutputStream out)
           
 
Methods inherited from class java.lang.Object
clone, finalize, getClass, notify, notifyAll, wait, wait, wait
 

Field Detail

m_bits

protected final byte[] m_bits
the actual bits of the filter

Constructor Detail

BloomFilter

public BloomFilter(int bitsPerEntry,
                   int optimalNumKeys)
create a new bloom filter

Parameters:
bitsPerEntry - The number of bits per entry in the filter we require (also known as m/n)
optimalNumKeys - The number of keys for which this filter optimally support (also known as n)

BloomFilter

public BloomFilter(byte[] bits,
                   int entries,
                   int numBits,
                   int optimalNumKeys)
Create a new BloomFilter from its constituent parts

Parameters:
bits - the array of bits for this filter
entries - the number of keys stored in the filter
numBits - the actual number of bits used
optimalNumKeys - the optimal number of keys to store in the filter
Method Detail

add

public void add(java.lang.String entry)
Add an entry to the Filter

Parameters:
entry - the entry. If this is null no change is made to the filter

delete

public void delete(java.lang.String entry)

contains

public boolean contains(java.lang.String entry)

isFull

public boolean isFull()

writeBits

public void writeBits(java.io.OutputStream out)
               throws java.io.IOException
java.io.IOException

getOptimalNumKeys

public int getOptimalNumKeys()

getNumEntries

public int getNumEntries()

getNumberOfHashes

public int getNumberOfHashes()

getNumBits

public int getNumBits()

toString

public java.lang.String toString()
Overrides:
toString in class java.lang.Object

equals

public boolean equals(java.lang.Object o)
Overrides:
equals in class java.lang.Object

hashCode

public int hashCode()
Overrides:
hashCode in class java.lang.Object

setBit

protected abstract void setBit(int bit)

clearBit

protected abstract void clearBit(int bit)

isSet

protected abstract boolean isSet(int bit)