Installation
============

Use ``pip install hyperloglog`` to install from PyPI.

Usage
=====

::

    import hyperloglog

    hll = hyperloglog.HyperLogLog(0.01)  # accept 1% counting error
    hll.add("hello")
    print(len(hll))  # 1
    hll.add("hello")
    print(len(hll))  # 1 as items aren't added more than once
    hll.add("hello again")
    print(len(hll))  # 2

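To give a feel for what the ``0.01`` argument costs in memory, here is a minimal sketch based on the standard HyperLogLog accuracy estimate, where the standard error is roughly ``1.04 / sqrt(m)`` for ``m`` registers. ``registers_for_error`` is a hypothetical helper, not part of this package, and the assumption that the register count is rounded up to a power of two may not match the package's internals exactly.

```python
import math


def registers_for_error(error_rate):
    """Return (p, m) for a target relative error.

    Hypothetical helper: p is the number of index bits and m = 2**p is the
    smallest power-of-two register count with 1.04 / sqrt(m) <= error_rate.
    """
    if not 0 < error_rate < 1:
        raise ValueError("error_rate must be between 0 and 1")
    # Solve 1.04 / sqrt(m) <= error_rate for m, then round up to a power of two.
    p = math.ceil(math.log2((1.04 / error_rate) ** 2))
    return p, 1 << p


p, m = registers_for_error(0.01)
print(p, m)  # 14 16384
```

Under these assumptions a 1% target needs 16384 registers, and halving the error roughly quadruples the register count.
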
If we add a further 1000 random strings (giving a total of 1002 strings), the count should be roughly within 1% of the true value. In the run shown here it reports 1007, which is within +/- 10.02 (1% of 1002) of the true value.
::

    # add 1000 random 30-char strings to hll
    import random
    import string

    for _ in range(1000):
        hll.add("".join(random.choice(string.ascii_letters) for _ in range(30)))
    print(len(hll))  # 1007 in this run; the exact figure varies with the random strings

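To check the error on your own data, one option is to compare the estimate against an exact count kept in a ``set``. The sketch below is illustrative only: ``measure_error`` and ``ExactCounter`` are hypothetical names, and the only thing assumed about the counter is the ``add`` and ``len`` usage shown above. ``ExactCounter`` is a stand-in so the example runs without the package installed.

```python
class ExactCounter:
    """Stand-in counter with the same add()/len() interface, backed by a set."""

    def __init__(self):
        self._seen = set()

    def add(self, item):
        self._seen.add(item)

    def __len__(self):
        return len(self._seen)


def measure_error(counter, items):
    """Feed items to counter; return (estimate, exact, relative_error)."""
    items = list(items)
    exact = len(set(items))
    if exact == 0:
        raise ValueError("items must contain at least one element")
    for item in items:
        counter.add(item)
    estimate = len(counter)
    return estimate, exact, abs(estimate - exact) / exact


# With the package installed, pass hyperloglog.HyperLogLog(0.01) instead.
print(measure_error(ExactCounter(), ["hello", "hello", "hello again"]))
```

A relative error near or below the rate passed to the constructor is what you would typically expect, although individual runs can fall outside it.
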
- Added Sliding window HLL version
- Added bias correction from HLL++