Leveling up Clojure’s Hash Maps
TLDR: We implemented the Lean Hash-Array Mapped Trie (HAMT) in ClojureScript (see here: https://github.com/bendyworks/lean-map) and have seen performance improvements of 2x in iteration, 2–4x in hashing the map, and an order of magnitude in equality checking, for the worst case, and in the best case it’s a speed up of two orders of magnitude. What really excites us is that the resulting implementation is simpler and uses fewer lines of code.
A new paper on HAMTs by Michael J. Steindorfer and Jurgen J. Vinju shows how to make HAMT implementations “lean”, meaning they consume less memory, and “efficient”, meaning speedier performance. The paper claims to make iteration and equality checks about 80–100% faster. As we have previously stated, the improvement is even better for ClojureScript. Since Clojure and ClojureScript share the same HAMT implementation, Clojure should see comparable performance gains.
The 20,000 foot view
Before we get into the key ideas of the paper a high level overview of HAMTs is needed.
The HAMT is possibly the best current implementation of an immutable HashMap. It is very performant, within 3–5x of a mutable HashMap’s performance, and only requires that keys be hashable, rather than sortable.
The HAMT is the default HashMap for Clojure and Scala, has been ported to Haskell, Ruby, and JavaScript.
We are using Clojure’s HAMT for comparison to the new Lean HAMT implementation.
A HAMT is a tree of nodes. Each node contains a pair of key-values (e.g. [key, value]
, ["name", "fred"]
) or references to sub-nodes which hold more key-value pairs. The node also contains meta-data about the contents of the array so a key can be located in the array. The node’s structure looks like this:
Node {
arrayMetadata,
Object[]
}
The node’s array will typically look like this:
["name", "mary", null, <sub-node 1>, "likes", "rockets", "lives", "on earth", null, <sub-node 2>, …]
Both the key-value pairs and sub-node links take up two slots in the array, with the value of null
in a key slot being a marker that the next slot is a sub-node.
Key-value pairs and sub-nodes should be grouped together
Looking at the above node you’ll see that we have some space wasted by using the null
s as markers for sub-node links. It would be better if we could have this instead:
["name", "mary", "likes", "rockets", "lives", "on earth", <sub-node 1>, <sub-node 2>, …]
In this hypothetical scenario, we get rid of the null
s that only exist to mark sub-nodes, and thus simplify scanning the array. The idea is to have two meta-data buckets in each node, one to track the key-value pairs and another to track the sub-nodes. Here is the new node structure:
Node {
keyValueMetadata,
nodeMetadata,
Object[]
}
The grouping of key-value pairs and sub-nodes allows us to do a direct linear scan for iteration, as opposed to the depth-first search, which is required in the current implementation. The result is simpler, and 2x faster, iteration for our HAMTs.
Deletions should return a canonical compact representation
Ideally, HAMTs that are equal should have the same internal representation. This is not the case with deletion, though. Performing a deletion on an HAMT can alter the structure such that two HAMTs with the same data could have different internal representations.
The potential spurious difference can occur when deleting a a key-value pair on a node with exactly one key-value pair and one sub-node. The key-value pair is removed but the sub-node still exists. So now the node only contains a superfluous sub-node, in effect making the HAMT “bloated.” An extra hop is needed to get to the other nodes. It’s worth noting that this is a state that is impossible to get from pure insertions.
Here’s what we mean:
["name", "mary", <sub-node 1>] -> [<sub-node 1>]
The Lean HAMT solution to this problem is to pull any data from immediately lower-level sub-nodes into the present node. This lifting propagates down this section of the trie.
Here’s the Lean HAMT delete operation:
["name", "mary", <sub-node 1>] -> [sub-node key, sub-node value, …]
The result is a canonical compact representation of the Lean HAMT for every operation on it. In addition to saving a little bit of memory, this allows a big improvement in the speed of equality checks. For the best case, in which two Lean HAMTs share nodes, we can do comparisons on nodes rather than having to iterate through both HAMTs. The resulting check is O(log n)
instead of O(n log n)
, and we see a performance improvement of two orders of magnitude compared to Clojure’s HAMT. In the worst case, a comparison between Lean HAMTs that share no nodes, we get an order of magnitude speed up because we only need to compare the nodes’ arrays, not two sequences.
Conclusion
This implementation shows that the Steindorfer and Vinju paper’s claims are replicable in the wild, and lowers the barriers of entry for hacking on HAMTs. We have confidence that other languages migrating to the Lean HAMT implementation will see similar gains. The Clojure dev team is currently looking at porting Lean HAMTs to Clojure.
Our post has focused on implementation details and performance gains. For more background on the conceptual framework behind Lean HAMTs, we invite you to RTFP.