Approximate counting with a floating-point counter - Computer Science > Data Structures and Algorithms

Abstract: Memory becomes a limiting factor in contemporary applications, such asanalyses of the Webgraph and molecular sequences, when many objects need to becounted simultaneously. Robert Morris Communications of the ACM, 21:840-842,1978 proposed a probabilistic technique for approximate counting that isextremely space-efficient. The basic idea is to increment a counter containingthe value $X$ with probability $2^{-X}$. As a result, the counter contains anapproximation of $\lg n$ after $n$ probabilistic updates stored in $\lg\lg n$bits. Here we revisit the original idea of Morris, and introduce a binaryfloating-point counter that uses a $d$-bit significand in conjunction with abinary exponent. The counter yields a simple formula for an unbiased estimationof $n$ with a standard deviation of about $0.6\cdot n2^{-d-2}$, and uses$d+\lg\lg n$ bits.We analyze the floating-point counter-s performance in a general frameworkthat applies to any probabilistic counter, and derive practical formulas toassess its accuracy.

Author: Miklos Csuros

Source: https://arxiv.org/