# Hashtables

StahlOS contains an implementation of hashtables based on Google's Swiss table hashtables, using as reference "Designing a Fast, Efficient, Cache-friendly Hash Table, Step by Step". It also implements the "cache-line-aware groups" optimization described at 47:25 of "Designing a Fast, Efficient, Cache-friendly Hash Table, Step by Step," since it only supports 8-byte entries.

It's implemented in AArch64 assembly and follows the typical Procedure Call Standard, so it can be tested (and used) in userspace. The implementation (in `kernel/src/common/hashtable/table.s`

) is hopefully quite readable, so it should be treated as part of this documentation as well, and is the primary reference for most of the algorithmic details of the hashtables.

## Collisions

The hashtables index hash groups (kinda sorta like buckets) with the low bits of the hash, and use the high 8 bits to find elements within the bucket. The number of hash groups is always a power of two, simplifying the implementation. Resizing occurs after a load factor of 15/16 (93.75%) is reached.

```
>>> def birthday_problem(hash_domain, num_entries):
... """Returns the odds that no two entries in a group of `num_entries`
... share the same random `hash_domain`-bit hash."""
... acc = 1
... for i in range(num_entries):
... acc *= ((hash_domain - i) / hash_domain)
... return acc
>>> def collision_chance(num_entries):
... log2_num_groups = ceil(log2(num_entries / 7 * 16 / 15))
... hash_domain = 2 ** 8 * 2 ** log2_num_groups
... return birthday_problem(hash_domain, num_entries)
>>> plt.title("probability of no collisions")
... plt.gca(xlabel="number of entries", ylabel="probability")
... plt.plot(range(1, 500), [(collision_chance(n), 0.5) for n in range(1, 500)])
... plt.show()
```

Since we do open addressing, the chance of a collision might actually be higher. TODO: Analyze this.

## Bitwise Tricks

### Estimate `log2_initial_group_count`

from Initial Capacity

`hashtable_init`

takes `log2_initial_group_count`

(which is efficient to compute on) rather than an initial capacity, which is what most programmers expect. We provide an efficient approximation (which avoids the cost of a `log`

and a division):

```
>>> def clz(x):
... size = x.size()
... expr = BitVecVal(size, size)
... for i in reversed(range(x.size())):
... j = size - i - 1
... bit = Extract(j, j, x)
... expr = If(bit == 1, i, expr)
... return expr
>>> def estimate_log2_initial_group_count(initial_cap):
... return 64 - clz((initial_cap * 3) >> 4)
>>> def estimate_log2_initial_group_count_correct(initial_cap):
... log2_initial_group_count = estimate_log2_initial_group_count(initial_cap)
... initial_group_count = BitVecVal(1, 64) << log2_initial_group_count
... capacity = initial_group_count * 7 * 15 / 16
... return And(ULE(initial_cap, capacity),
... If(initial_cap < 7, ULT(capacity, initial_cap + 8),
... ULT(capacity, initial_cap * 4)))
>>> n = BitVec("n", 64)
>>> solve(Not(estimate_log2_initial_group_count_correct(n)), ULE(1, n), ULT(n, 2 ** 56))
no solution
```

As Z3 shows (by failing to find a counter-example), for any initial capacity between 1 and 2^{56}, the approximation in `estimate_log2_initial_group_count`

is correct, and not *too* wasteful.

### Find Free Slot

During insertion, we want to be able to find a free slot in the hash group, given the metadata byte.

```
>>> cs = []
>>> def clz(x):
... size = x.size()
... expr = size
... for i in reversed(range(x.size())):
... j = size - i - 1
... bit = Extract(j, j, x)
... expr = If(bit == 1, i, expr)
... return expr
>>> def index(l, n, canary):
... cs.append(n >= 0)
... cs.append(n < len(l))
... expr = canary
... for i in range(len(l)):
... expr = If(i == n, l[i], expr)
... return expr
>>> def find_free_slot(n):
... return 63 - clz(~(n | 0xffffffffffffff01))
>>> def find_free_slot_correct(n):
... p = [Bool(f"p{i}") for i in range(8)]
... cs.append(Not(p[0]))
... for i in range(1, 8):
... cs.append(p[i] == (Extract(i, i, n) == 1))
... cs.append(Extract(63, 8, n) == 0)
... i = Int("i")
... cs.append(i == find_free_slot(n))
... return If(i == 0,
... reduce(And, (Not(p[i]) for i in range(1, 8))),
... Not(index(p, i, False)))
>>> n = BitVec("n", 64)
>>> solve(Not(find_free_slot_correct(n)), *cs)
no solution
```

Since Z3 couldn't find examples of incorrectness, `find_free_slot`

must be a correct implementation.