Recursive n-gram hashing is pairwise independent, at best - Computer Science > Databases





Abstract: Many applications use sequences of n consecutive symbols (n-grams). Hashing these n-grams can be a performance bottleneck. For more speed, recursive hash families compute hash values by updating previous values. We prove that recursive hash families cannot be more than pairwise independent. While hashing by irreducible polynomials is pairwise independent, our implementations either run in time O(n) or use an exponential amount of memory. As a more scalable alternative, we make hashing by cyclic polynomials pairwise independent by ignoring n-1 bits. Experimentally, we show that hashing by cyclic polynomials is twice as fast as hashing by irreducible polynomials. We also show that randomized Karp-Rabin hash families are not pairwise independent.
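
To make the idea of a recursive (rolling) hash concrete, here is a minimal sketch of hashing by cyclic polynomials over byte strings. It is an illustration under stated assumptions, not the authors' implementation: the 32-bit width `L`, the seeded lookup table, and the helper names `rotl`, `make_table`, `direct_hash`, `rolling_hashes`, and `drop_bits` are all hypothetical. In particular, the abstract only says that n-1 bits are ignored; discarding the n-1 low-order bits, as done here, is an assumption of this sketch.

```python
import random

L = 32                      # hash width in bits (assumed for this sketch)
MASK = (1 << L) - 1


def rotl(x, r):
    """Rotate the L-bit value x left by r positions."""
    r %= L
    return ((x << r) | (x >> (L - r))) & MASK


def make_table(seed=0):
    """Assign a random L-bit value to each of the 256 byte symbols."""
    rng = random.Random(seed)
    return [rng.getrandbits(L) for _ in range(256)]


def direct_hash(table, gram):
    """Hash one n-gram from scratch in O(n): rotate, then XOR in each symbol."""
    h = 0
    for symbol in gram:
        h = rotl(h, 1) ^ table[symbol]
    return h


def rolling_hashes(table, data, n):
    """Yield the hash of every n-gram of data, updating the previous value."""
    if len(data) < n:
        return
    h = direct_hash(table, data[:n])
    yield h
    for i in range(n, len(data)):
        outgoing = rotl(table[data[i - n]], n)   # contribution of the symbol leaving the window
        h = rotl(h, 1) ^ outgoing ^ table[data[i]]
        yield h


def drop_bits(h, n):
    """Keep L-n+1 bits by discarding the n-1 low-order bits (assumed choice)."""
    return h >> (n - 1)


if __name__ == "__main__":
    table = make_table(seed=42)
    for h in rolling_hashes(table, b"abracadabra", 3):
        print(drop_bits(h, 3))
```

Each update costs a constant number of rotations and XORs regardless of n, which is the source of the speed advantage of recursive families over rehashing every n-gram from scratch.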



Authors: Daniel Lemire, Owen Kaser

Source: https://arxiv.org/






