Perfect(ing) hashing in NetBSD

Jörg Sonnenberger

[email protected]

Tokyo, March 16, 2013

AsiaBSDCon 2013

Overview

What are Perfect Hash Functions (PHF)?
Why should you care?
Characteristics
History of perfect hash functions
Excursion I: Randomized hash functions
Excursion II: Random graphs
CHM
BPZ
nbperf
NetBSD's cdb
The future: route lookup

What are Perfect Hash Functions (PHF)?

Hash functions: map a set of keys to a set of integers
...used to implement associative arrays
Perfect: injective, i.e. distinct keys are mapped to distinct values
A.k.a.: collision-free
Minimal: if the value range is [0..n-1] for n keys

Why should you care?

Algorithmic attacks try to force high collision rates as DoS
Deterministic timing
Compact, even if keys don't fit into memory
It's the S in CS
Because they are used in NetBSD

Characteristics

Does the PHF preserve the key order? I.e. can I decide what value each key gets assigned?
What is the space requirement for the PHF itself?
What is the computational cost of the PHF?
If it is not minimal, how dense are the values?

History of perfect hash functions

Bostic: predecessor of GNU gperf, 1984
Knuth: examples in TAoCP
Pearson: Fast Hashing of Variable-Length Text Strings, 1990
...looks quite a bit like RC4
Common issue: lack of scalability
If they work: very fast, reasonable compact

Excursion I: Randomized hash functions

Families of hash functions derived from a parameter (seed)
Can be used to prevent complexity attacks
Strong property: independent random hash functions behave like random variables

Excursion II: Random graphs

Edges in the graph formed from random variable
Very interesting part of graph theory
Property: cycles in 2-graphs and 3-graphs are rare, if graph is dense enough
2-graphs: |V| / |E| >= 2
3-graphs: |V| / |E| > 1.24

CHM

Found by Czech, Havas, Majewski in 1992
One of the first expected linear run time algorithm
MPHF
Order-preserving
n * c * ceil(ld(n)) bits per key
Compute 2/3 hash values, reduce by modulo, 2/3 table lookups, sum, modulo

CHM illustrated

	Key 0	Key 1	Key 2	Key 3
h0	0	2	3	1
h1	1	3	4	4

BPZ

Found by Botelho, Pagh, Ziviani in 2007
Similar to CHD
Not order-preserving , not minimal
Same computation cost as CHD for PHF
Using 3-graphs: 1.98 bits per key
Post-process with counting filter to obtain MPHF
Including counting filter: 2.79 bits per key for MPHF
Adds two table lookups and one 64bit population count

BPZ illustrated

	Key 0	Key 1	Key 2	Key 3
h0	0	2	3	1
h1	1	3	4	4

CHD

"Hash, displace, compress"
Found by Belazzougui, Botelho, Dietzfelbinger in 2009
1.4 bit per key for PHF
Significantly more complicated than BPZ: no further details

nbperf

Started in 2009 to provide an alternative for GNU gperf for large key sets
cmph didn't fit: license-wise and in terms of code
Provides CHM for 2-graphs and 3-graphs, BPZ for 3-graphs
CHD is too new, will follow later
Options for algorithm, density, stable seeding

nbperf performance

Webster's Second International Dictionary
234977 and 76295 lines

Input	Algorithm	Tries	Run time in s
web2	CHM	1	0.58
	CHM3	39	0.58
	BPZ	11	0.51
web2a	CHM	12	0.35
	CHM3	7	0.17
	BPZ	18	0.16

nbperf output for CHM


#include <stdlib.h>

uint32_t
hash(const void * __restrict key, size_t keylen)
{
  static const uint32_t g[469955] = {
    /* ... */
  };
  uint32_t h[3];

  mi_vector_hash(key, keylen, 0x00000000U, h);

  return (g[h[0] % 469955] + g[h[1] % 469955]) % 234977;
}

Berkeley DB for index databases

No atomic transactions
Result: Copy-update-move schemes
High database size overhead
Size matters for pre-built databases
Complex code
Per-application database caching
Needs locking in multi-threaded programs

SQLite

Supports transactions
Limited support for concurrent read-only access
Still high database size overhead
Even larger code base

The NetBSD constant database

First part: indexed variable size storage
Second part: perfect hash index
No storage of keys: regenerate keys from data
Tiny: 1.6KB reader, 4.3KB writer on AMD64
Databases similar or smaller size than textual source
Lock-free, memory mapped IO
Worst case: 3 block reads for key lookup, 1 block read for offset
NetBSD 6: services, terminfo and device database
Database size: 1/4 of BDB and faster creation

Route lookup in the BSD network stack

Problem with CIDR: find best matching prefix (BMP)
Implementation: radix tree
Very complex code
Solves more generic problem: non-continous netmasks
Kitchen-sink approach in the network stack
Mixes policy and implementation details
David Young started separation in NetBSD with thin wrapper layer

Multi-level hashing

Find BMP using binary search on the possible prefix lengths
Needs markers if a longer prefix exists and the search starts at a shorter prefix
References to the covered prefix in the marker avoid backtracking

Multi-level hashing example

Destination network	Gateway
0/0	192.168.0.10
127/8	127.0.0.1
192.168/16	R: 0/0
192.168.0/24	192.168.0.2
192.168.1/24	192.168.0.1
10/8	192.168.0.1
10.0/16	R: 10/8
10.0.10/24	192.168.0.5
10.192/12	192.168.0.6
11/8	R: 10/8
11.192/12	192.168.0.7

192.168.0.5: 192.168/16 -> 192.168.0/16 done
10.0.9.8: 10.0/16 -> 10.0.9/24 done

Multi-level performance

Perfect hashing: deterministic cost (memory accesses!)
BPZ with 2-graph and no post-filter: 50% usage, 2 bit per key
Separate tables for larger prefixes
IPv4: 64bit per entry (32bit network, 32bit next-hop ID)
IPv6: from 64bit to 160bit
Fits into L3 cache even with 100,000 entries
Question: construction time
Dynamic branching: less than 2 lookups on average

Summary

Modern algorithms like CHM, BPZ and CHD are fast and practical
Useful for performance sensitive read-mostly data structures
Useful for compact on-disk databases
Available in NetBSD 6

Q&A

Questions?