Mapping Every Key, Part 2
I'm on a journey to make a total map data structure in Haskell. In Part 1, I surveyed common data structures for partial maps and considered their applicability to total maps, ultimately deciding on lazily-generated tries. This post, Part 2, will be dedicated to the specification and formally deriving an implementation. Part 3 will produce a Haskell implementation.
Why So Serious?
Formal specification is widely considered a heavyweight methodology. Program derivation even more so. Also, as you'll see, the resulting implementation is simple enough that surely I could have jumped straight to code without all this ivory tower nonsense. What's the point? I want to demonstrate a process:
- Find the abstraction.
- Make a specification.
- Calculate an implementation.
The only reason for it all is that I want to use it as a relatively simple example.
The Goal
Here is a first pass at a Haskell signature of what I'll implement, to give a sense of where I'm trying to go, but what I end up actually doing will look quite different.
data Trie a b
(!) :: Trie a b -> a -> b
trie :: (a -> b) -> Trie a b
adjust :: a -> (b -> b) -> Trie a b -> Trie a bFind the Abstraction
Finding the right abstraction is the most tragically overlooked part of design. I think it's a misunderstood idea. Software engineers usually talk of abstraction as though it is a mechanism for hiding internal complexity, leading one to think it has something to do with indirection or the relationship between interface and implementation. In fact, abstraction is something that our brains do so naturally, so fundamentally to how we think, that it can be difficult to describe informally. Its purpose is to be able to reason about complicated things as though they are simple things. However, abstraction is not merely a less intimidating veneer hiding complex machinery. Abstraction is what it means.
In the case of total maps, the answer comes easily, so there is unfortunately no opportunity to demonstrate going through a few false starts before finding the right abstraction. A trie is a total map. A trie from \(A\) to \(B\) denotes a function \(A \to B\). That's it!
Make a Specification
I like to make my specifications denotational. I've always found the Wikipedia page on denotational semantics unhelpful, so here is a brief explanation in my own words. Denotational semantics assigns a precise mathematical meaning, called a denotation or a model, to each construct in a language or interface, via a semantic function that maps syntax to its meaning. The denotation is the abstraction. It says what something is, independently of how it's implemented.
First, I will decide on a notation for the type of a trie. We'll say that \(A \twoheadrightarrow B\) is a trie with keys of type \(A\) and values of type \(B\).
$$
\_ \twoheadrightarrow \_ : \mathrm{Type} \times \mathrm{Type} \to \mathrm{Type}
$$
In the previous section, we already decided what the denotation of a trie is. It's a function. The (!) :: Trie a b -> a -> b function from the Haskell signature earlier is a good candidate for the semantic function, because it's exactly the denotation. It translates a Trie a b into a function a -> b. It is conventional to use notation like \(\left[\!\left[ \_ \right]\!\right]\) for the semantic function, and I will do the same here. However, the exact semantics will also depend on the key type, so I'll add the key type as a suffix.
$$
\left[\!\left[ \_ \right]\!\right]_A : (A \twoheadrightarrow B) \to (A \to B)
$$
The semantic function defines a precise analogy. You can read it as "a trie is like a function." We have not yet specified how a trie is like a function, that is, how the representation of a trie should be interpreted. We can't do that until we have defined the representation.
Now that we have the semantic function, we can write specifications for the remaining functions. We follow the principle of compositionality, which is that the meaning of an expression is defined in terms of the meanings of its subexpressions.
The \(\mathrm{trie}\) function is an inverse of the semantic function, so its specification should leave \(f\) unchanged.
$$
\begin{align}
\mathrm{trie}_A &: (A \to B) \to (A \twoheadrightarrow B)
\\
\left[\!\left[ \mathrm{trie}(f) \right]\!\right] &= f
\end{align}
$$
The \(\mathrm{adjust}\) function updates the existing function output for the given key. Make sure to note how the meaning of \(\mathrm{adjust}\) is defined in terms of the meaning of the trie being adjusted.
$$
\begin{align}
\mathrm{adjust}_A &: A \times (B \to B) \times (A \twoheadrightarrow B) \to (A \twoheadrightarrow B)
\\
\left[\!\left[ \mathrm{adjust}(a,h,t) \right]\!\right] &= \lambda a'.\; \begin{cases}
h(\left[\!\left[ t \right]\!\right](a')), & a' = a \\
\left[\!\left[ t \right]\!\right](a'), & \text{otherwise}
\end{cases}
\end{align}
$$
Find a Representation
It's not really going to be the case that we find the representation for a trie, because it depends on the key type. Let's instead find representations for common types, in particular the building blocks of algebraic data types.
We know that, however we end up representing them, \(A \twoheadrightarrow B\) must be isomorphic to \(A \to B\), via \(\left[\!\left[ \_ \right]\!\right]_A\) and \(\mathrm{trie}_A\). Whatever the representation of tries with keys of type \(A\) is, it must be isomorphic to functions with domain \(A\). If we represent tries as algebraic data types, we can use this isomorphism as a guide. In particular, two (finite) algebraic data types are isomorphic iff they have the same cardinalities, according to these rules:
$$
\begin{align}
| \_ | &: \mathrm{Type} \to \mathbb{N}
\\
| \mathbf{0} | &= 0
\\
| \mathbf{1} | &= 1
\\
| A + B | &= | A | + | B |
\\
| A \times B | &= | A | \times | B |
\\
| A \to B | &= {| B |}^{| A |}
\end{align}
$$
I'm using the notation \(\mathbf{0}\) and \(\mathbf{1}\) to mean the empty type and unit type, respectively. \(A + B\), a sum type, is the disjoint union of \(A\) and \(B\). \(A \times B\), a product type, is the cartesian product of \(A\) and \(B\). \(A \to B\), a function type, is also sometimes called an exponent type and written \(B^A\). I choose to use the more familiar arrow notation because that's what more people are used to seeing, but I will sometimes call them exponent types.
The upshot is that we can interpret a type as a natural number, use laws of arithmetic to manipulate it, then reinterpret the result as a different type. The only question now is what form we want tries to have. The whole point of the trie is that adjusting its values should result in direct changes to its representation, not in wrapper functions, and since we can't make changes to the body of a lambda, we should avoid exponents (\(\_ \to \_\)) in the representation entirely. Also, there is no point using sums (\(\_ + \_ \)) in the representation of a trie, because that would imply that some tries have structure that other tries of the same type do not, which is something we might expect for partial maps, but not for total maps. Also, the \(\mathrm{trie}\) function should always be able to return a trie, so it should be impossible for a trie type to have no values; this rules out the empty type, \(\mathbf{0}\). All that remains, then, are unit (\(\mathbf{1}\)), products (\(\_ \times \_\)), and abstract types (\(A\), \(B\), etc.). We'll also allow tries whose keys are abstract types and tries whose keys are tries, since they will be constructed according to the rules we've set for ourselves anyway. Let's codify the restrictions to be precise:
$$
\newcommand{\nonterm}[1]{\langle \textit{#1} \rangle} % nonterminal
\newcommand{\bnfdef}{::=}
\newcommand{\bnfor}{\mid}
\begin{aligned}
\nonterm{trie} &\bnfdef \mathbf{1}
\\
&\bnfor \nonterm{trie} \times \nonterm{trie}
\\
&\bnfor \nonterm{index} \twoheadrightarrow \nonterm{trie}
\\
&\bnfor X
\\
\\
\nonterm{index} &\bnfdef \nonterm{index} \twoheadrightarrow \nonterm{index}
\\
&\bnfor A
\\
&\bnfor B
\end{aligned}
$$
\(X\) is distinguished from \(A\) and \(B\) in that \(X\) will never be an input type while \(A\) and \(B\) will never be output types.
Let's begin! The general strategy is:
- Start with the target isomorphism between tries and functions.
- Manipulate until the right hand side (the function side) such that it satisfies the above grammar.
- Turn the isomorphism into an equality. This is justified only because we're defining the representation to be any type that is isomorphic to the original function type and that satisfies the grammar. As long as these conditions are met, we are free to interpret the isomorphism as the definition.
Empty
$$
\begin{align}
\mathbf{0} \twoheadrightarrow X & \simeq \mathbf{0} \to X
\\
| \mathbf{0} \twoheadrightarrow X | & = | \mathbf{0} \to X |
\\
| \mathbf{0} \twoheadrightarrow X | & = {| X |}^{| \mathbf{0} |}
\\
| \mathbf{0} \twoheadrightarrow X | & = {| X |}^0
\\
| \mathbf{0} \twoheadrightarrow X | & = 1
\\
| \mathbf{0} \twoheadrightarrow X | & = | \mathbf{1} |
\\
\mathbf{0} \twoheadrightarrow X & \simeq \mathbf{1}
\\
\mathbf{0} \twoheadrightarrow X & = \mathbf{1}
\end{align}
$$
It makes sense that a trie whose key type has no values would not contain any output values, since there would never be any way to look them up.
Unit
$$
\begin{align}
\mathbf{1} \twoheadrightarrow X & \simeq \mathbf{1} \to X
\\
| \mathbf{1} \twoheadrightarrow X | & = | \mathbf{1} \to X |
\\
| \mathbf{1} \twoheadrightarrow X | & = {| X |}^{| \mathbf{1} |}
\\
| \mathbf{1} \twoheadrightarrow X | & = {| X |}^1
\\
| \mathbf{1} \twoheadrightarrow X | & = | X |
\\
\mathbf{1} \twoheadrightarrow X & \simeq X
\\
\mathbf{1} \twoheadrightarrow X & = X
\end{align}
$$
Since there is exactly one possible input, we expect the trie to be just the single possible output.
Sums
$$
\begin{align}
A + B \twoheadrightarrow X & \simeq A + B \to X
\\
| A + B \twoheadrightarrow X | & = | A + B \to X |
\\
| A + B \twoheadrightarrow X | & = {| X |}^{| A + B |}
\\
| A + B \twoheadrightarrow X | & = {| X |}^{| A | + | B |}
\\
| A + B \twoheadrightarrow X | & = {| X |}^{| A |} \times {| X |}^{| B |}
\\
| A + B \twoheadrightarrow X | & = | A \to X | \times | B \to X |
\\
| A + B \twoheadrightarrow X | & = | (A \to X) \times (B \to X) |
\\
A + B \twoheadrightarrow X & \simeq (A \to X) \times (B \to X)
\\
A + B \twoheadrightarrow X & \simeq (A \twoheadrightarrow X) \times (B \twoheadrightarrow X)
\\
A + B \twoheadrightarrow X & = (A \twoheadrightarrow X) \times (B \twoheadrightarrow X)
\end{align}
$$
A trie whose keys have a sum type is a pair of tries, one for each type in the sum. This makes sense because it means we have a place to look for outputs given each possible input.
Products
$$
\begin{align}
A \times B \twoheadrightarrow X & \simeq A \times B \to X
\\
| A \times B \twoheadrightarrow X | & = | A \times B \to X |
\\
| A \times B \twoheadrightarrow X | & = {| X |}^{| A \times B |}
\\
| A \times B \twoheadrightarrow X | & = {| X |}^{| A | \times | B |}
\\
| A \times B \twoheadrightarrow X | & = {| X |}^{| B | \times | A |}
\\
| A \times B \twoheadrightarrow X | & = {({| X |}^{| B |})}^{| A |}
\\
| A \times B \twoheadrightarrow X | & = | A \to {| X |}^{| B |} |
\\
| A \times B \twoheadrightarrow X | & = | A \to (B \to X) |
\\
| A \times B \twoheadrightarrow X | & = | A \to (B \to X) |
\\
A \times B \twoheadrightarrow X & \simeq A \to (B \to X)
\\
A \times B \twoheadrightarrow X & \simeq A \twoheadrightarrow (B \twoheadrightarrow X)
\\
A \times B \twoheadrightarrow X & = A \twoheadrightarrow (B \twoheadrightarrow X)
\end{align}
$$
A trie whose keys have a product type is a nested trie. This makes sense, because we expect the depth of a trie to be proportional to the size of the key.
Exponents
Yes, it even makes sense for the key of a trie to be a function! Here, it's simpler to just use the specified isomorphism between tries and functions to manipulate the type, instead of manipulating via the types' cardinalities.
$$
\begin{align}
(A \to B) \twoheadrightarrow X & \simeq (A \to B) \to X
\\
(A \to B) \twoheadrightarrow X & \simeq (A \twoheadrightarrow B) \twoheadrightarrow X
\\
(A \to B) \twoheadrightarrow X & = (A \twoheadrightarrow B) \twoheadrightarrow X
\end{align}
$$
A trie whose keys are functions is just a trie whose keys are tries. This works because we know that all of the ways we can represent a trie are valid types of keys for a trie to have. Study the BNF from earlier if this is not immediately making sense.
All Together
We have arrived at a pleasantly simple family of trie representations.
$$
\begin{align}
\mathbf{0} \twoheadrightarrow X & = \mathbf{1}
\\
\mathbf{1} \twoheadrightarrow X & = X
\\
A + B \twoheadrightarrow X & = (A \twoheadrightarrow X) \times (B \twoheadrightarrow X)
\\
A \times B \twoheadrightarrow X & = A \twoheadrightarrow (B \twoheadrightarrow X)
\\
(A \to B) \twoheadrightarrow X & = (A \twoheadrightarrow B) \twoheadrightarrow X
\end{align}
$$
Define the Semantic Function
Now that we have some concrete representations, we can define their corresponding semantic functions. It doesn't matter how we define it, as long as it is invertible, because the \(\mathrm{trie}\) function is its inverse. I won't bother making a bespoke proof that it's invertible. The derivation of the \(\mathrm{trie}\) function later will be sufficient proof.
$$
\begin{align}
{\left[\!\left[ \_ \right]\!\right]}_A &: (A \twoheadrightarrow X) \to (A \to X)
\\
\\
{\left[\!\left[ \_ \right]\!\right]}_{\mathbf{0}} &: \mathbf{1} \to (\mathbf{0} \to X)
\\
{\left[\!\left[ \star \right]\!\right]}_{\mathbf{0}} &= \text{absurd}
\\
\\
{\left[\!\left[ \_ \right]\!\right]}_{\mathbf{1}} &: X \to (\mathbf{1} \to X)
\\
{\left[\!\left[ x \right]\!\right]}_{\mathbf{1}} &= \lambda \star.\; x
\\
\\
{\left[\!\left[ \_ \right]\!\right]}_{A + B} &: (A \twoheadrightarrow X) \times (B \twoheadrightarrow X) \to (A + B \to X)
\\
{\left[\!\left[ (t_1,t_2) \right]\!\right]}_{A + B} &= \lambda
\begin{cases}
\mathrm{inj}_1(a) \mapsto {\left[\!\left[ t_1 \right]\!\right]}_A(a) \\
\mathrm{inj}_2(b) \mapsto {\left[\!\left[ t_2 \right]\!\right]}_B(b)
\end{cases}
\\
\\
{\left[\!\left[ \_ \right]\!\right]}_{A \times B} &: (A \twoheadrightarrow (B \twoheadrightarrow X)) \to (A \times B \to X)
\\
{\left[\!\left[ t \right]\!\right]}_{A \times B} &= \lambda (a,b).\; {\left[\!\left[ {\left[\!\left[ t \right]\!\right]}_{A}(a) \right]\!\right]}_{B}(b)
\\
\\
{\left[\!\left[ \_ \right]\!\right]}_{A \to B} &: ((A \twoheadrightarrow B) \twoheadrightarrow X) \to ((A \to B) \to X)
\\
{\left[\!\left[ t \right]\!\right]}_{A \to B} &= \lambda f.\; {\left[\!\left[ t \right]\!\right]}_{A \twoheadrightarrow B}(\mathrm{trie}(f))
\end{align}
$$
For exponents, the semantic function depends on the \(\mathrm{trie}\) function. The specification of \(\mathrm{trie}\) depends on the semantic function, which will also affect how we derive the implementation of \(\mathrm{trie}\) later. The mutual recursion is well-founded because it implements structural recursion on the key type, that is, the key type shrinks with the recursion.
Derive Function Implementations
At this point, we have everything we need to derive everything else. That is, we have:
- interface: the API being provided
- model: what the values denote
- specification: what the operations mean, in terms of the model
- representation: how the values are constructible in code
- semantic function: how the representation relates to the denotation
All that's left is to derive \(\mathrm{trie}\) and \(\mathrm{adjust}\).
\(\mathrm{trie}\)
To refresh your memory, here is the specification for the \(\mathrm{trie}\) function:
$$
\begin{align}
{\mathrm{trie}}_A &: (A \to X) \to (A \twoheadrightarrow X)
\\
{\left[\!\left[ {\mathrm{trie}}_A(f) \right]\!\right]}_A &= f
\end{align}
$$
Now that we have concrete representations and a proper definition of \(\left[\!\left[ \_ \right]\!\right]\), it's not terribly difficult to derive \(\mathrm{trie}\).
$$
\begin{align}
{\mathrm{trie}}_{\mathbf{0}} &: (\mathbf{0} \to X) \to \mathbf{1}
\\
{\left[\!\left[ {\mathrm{trie}}_{\mathbf{0}}(\text{absurd}) \right]\!\right]}_{\mathbf{0}} &= \text{absurd}
\\
{\left[\!\left[ {\mathrm{trie}}_{\mathbf{0}}(\text{absurd}) \right]\!\right]}_{\mathbf{0}} &= {\left[\!\left[ \star \right]\!\right]}_{\mathbf{0}}
\\
{\mathrm{trie}}_{\mathbf{0}}(\text{absurd}) &= \star
\\
\\
{\mathrm{trie}}_{\mathbf{1}} &: (\mathbf{1} \to X) \to X
\\
{\left[\!\left[ \mathrm{trie}_{\mathbf{1}}(f) \right]\!\right]}_{\mathbf{1}} &= f
\\
\lambda \star.\; \mathrm{trie}_{\mathbf{1}}(f) &= f
\\
\lambda \star.\; \mathrm{trie}_{\mathbf{1}}(f) &= \lambda \star.\; f(\star)
\\
\mathrm{trie}_{\mathbf{1}}(f) &= f(\star)
\\
\\
{\mathrm{trie}}_{A + B} &: (A + B \to X) \to (A \twoheadrightarrow X) \times (B \twoheadrightarrow X)
\\
{\left[\!\left[ {\mathrm{trie}}_{A + B}(f) \right]\!\right]}_{A + B} &= f
\\
{\left[\!\left[ (\mathrm{proj}_1({\mathrm{trie}}_{A + B}(f)), \mathrm{proj}_2({\mathrm{trie}}_{A + B}(f))) \right]\!\right]}_{A + B} &= f
\\
\lambda
\begin{cases}
\mathrm{inj}_1(a) \mapsto {\left[\!\left[ \mathrm{proj}_1({\mathrm{trie}}_{A + B}(f)) \right]\!\right]}_A(a) \\
\mathrm{inj}_2(b) \mapsto {\left[\!\left[ \mathrm{proj}_2({\mathrm{trie}}_{A + B}(f)) \right]\!\right]}_B(b)
\end{cases} &= f
\\
({\left[\!\left[ \mathrm{proj}_1({\mathrm{trie}}_{A + B}(f)) \right]\!\right]}_A, {\left[\!\left[ \mathrm{proj}_2({\mathrm{trie}}_{A + B}(f)) \right]\!\right]}_B) &= (f \circ \mathrm{inj}_1, f \circ \mathrm{inj}_2)
\\
(\mathrm{proj}_1({\mathrm{trie}}_{A + B}(f)), \mathrm{proj}_2({\mathrm{trie}}_{A + B}(f))) &= (\mathrm{trie}_A(f \circ \mathrm{inj}_1), \mathrm{trie}_B(f \circ \mathrm{inj}_2))
\\
{\mathrm{trie}}_{A + B}(f) &= (\mathrm{trie}_A(f \circ \mathrm{inj}_1), \mathrm{trie}_B(f \circ \mathrm{inj}_2))
\\
\\
{\mathrm{trie}}_{A \times B} &: (A \times B \to X) \to (A \twoheadrightarrow (B \twoheadrightarrow X))
\\
{\left[\!\left[ {\mathrm{trie}}_{A \times B}(f) \right]\!\right]}_{A \times B} &= f
\\
\lambda (a,b).\; {\left[\!\left[ {\left[\!\left[ \mathrm{trie}_{A \times B}(f) \right]\!\right]}_{A}(a) \right]\!\right]}_{B}(b) &= f
\\
\lambda (a,b).\; {\left[\!\left[ {\left[\!\left[ \mathrm{trie}_{A \times B}(f) \right]\!\right]}_{A}(a) \right]\!\right]}_{B}(b) &= \lambda (a,b).\; f(a,b)
\\
{\left[\!\left[ {\left[\!\left[ \mathrm{trie}_{A \times B}(f) \right]\!\right]}_{A}(a) \right]\!\right]}_{B}(b) &= f(a, b)
\\
\lambda b.\; {\left[\!\left[ {\left[\!\left[ \mathrm{trie}_{A \times B}(f) \right]\!\right]}_{A}(a) \right]\!\right]}_{B}(b) &= \lambda b.\; f(a, b)
\\
{\left[\!\left[ {\left[\!\left[ \mathrm{trie}_{A \times B}(f) \right]\!\right]}_{A}(a) \right]\!\right]}_{B} &= \lambda b.\; f(a, b)
\\
{\left[\!\left[ \mathrm{trie}_{A \times B}(f) \right]\!\right]}_{A}(a) &= {\mathrm{trie}}_B(\lambda b.\; f(a, b))
\\
\lambda a.\; {\left[\!\left[ \mathrm{trie}_{A \times B}(f) \right]\!\right]}_{A}(a) &= \lambda a.\; {\mathrm{trie}}_B(\lambda b.\; f(a, b))
\\
{\left[\!\left[ \mathrm{trie}_{A \times B}(f) \right]\!\right]}_{A} &= \lambda a.\; {\mathrm{trie}}_B(\lambda b.\; f(a, b))
\\
\mathrm{trie}_{A \times B}(f) &= {\mathrm{trie}}_A(\lambda a.\; {\mathrm{trie}}_B(\lambda b.\; f(a, b)))
\\
\\
{\mathrm{trie}}_{A \to B} &: ((A \to B) \to X) \to ((A \twoheadrightarrow B) \twoheadrightarrow X)
\\
{\left[\!\left[ {\mathrm{trie}}_{A \to B}(f) \right]\!\right]}_{A \to B} &= f
\\
\lambda g.\; {\left[\!\left[ {\mathrm{trie}}_{A \to B}(f) \right]\!\right]}_{A \to B}(g) &= \lambda g.\; f(g)
\\
\lambda g.\; {\left[\!\left[ \mathrm{trie}_{A \to B}(f) \right]\!\right]}_{A \twoheadrightarrow B}(\mathrm{trie}_A(g)) &= \lambda g.\; f(g)
\\
\lambda t.\; {\left[\!\left[ \mathrm{trie}_{A \to B}(f) \right]\!\right]}_{A \twoheadrightarrow B}(t) &= \lambda t.\; f({\left[\!\left[ t \right]\!\right]}_A)
\\
{\left[\!\left[ \mathrm{trie}_{A \to B}(f) \right]\!\right]}_{A \twoheadrightarrow B} &= \lambda t.\; f({\left[\!\left[ t \right]\!\right]}_A)
\\
\mathrm{trie}_{A \to B}(f) &= \mathrm{trie}_{A \twoheadrightarrow B}(\lambda t.\; f({\left[\!\left[ t \right]\!\right]}_A))
\end{align}
$$
\(\mathrm{adjust}\)
Let's now derive \(\mathrm{adjust}\) according to our previous specification, which I'll repeat here:
$$
\begin{align}
\mathrm{adjust}_A &: A \times (B \to B) \times (A \twoheadrightarrow B) \to (A \twoheadrightarrow B)
\\
\left[\!\left[ \mathrm{adjust}(a,h,t) \right]\!\right] &= \lambda a'.\; \begin{cases}
h(\left[\!\left[ t \right]\!\right](a')), & a' = a \\
\left[\!\left[ t \right]\!\right](a'), & \text{otherwise}
\end{cases}
\end{align}
$$
Since \(\mathrm{adjust}\) is a more complicated function, I left some of the more obvious steps implicit to save space. Despite my attempt to simplify, it's a pretty dense chunk of equational reasoning, so you would be forgiven if your eyes glaze over and your brain turns off. It's more important to me that you understand the process of deriving an implementation from a denotational specification than that you concern yourself with the tedious details of this specific example. Still, here it is.
$$
\begin{align}
\mathrm{adjust}_{\mathbf{0}} &: \mathbf{0} \times (X \to X) \times \mathbf{1} \to \mathbf{1}
\\
{\left[\!\left[ \mathrm{adjust}_{\mathbf{0}}(a,h,\star) \right]\!\right]}_{\mathbf{0}} &= \lambda a'.\; \begin{cases}
h({\left[\!\left[ t \right]\!\right]}_{\mathbf{0}}(a')), & a' = a \\
{\left[\!\left[ t \right]\!\right]}_{\mathbf{0}}(a'), & \text{otherwise}
\end{cases}
\\
{\left[\!\left[ \mathrm{adjust}_{\mathbf{0}}(a,h,\star) \right]\!\right]}_{\mathbf{0}} &= \mathrm{absurd}
\\
{\left[\!\left[ \mathrm{adjust}_{\mathbf{0}}(a,h,\star) \right]\!\right]}_{\mathbf{0}} &= {\left[\!\left[ \star \right]\!\right]}_{\mathbf{0}}
\\
\mathrm{adjust}_{\mathbf{0}}(a,h,\star) &= \star
\\
\\
\mathrm{adjust}_{\mathbf{1}} &: \mathbf{1} \times (X \to X) \times X \to X
\\
{\left[\!\left[ \mathrm{adjust}_{\mathbf{1}}(\star, h, x) \right]\!\right]}_{\mathbf{1}} &= \lambda a'.\; \begin{cases} h({\left[\!\left[ x \right]\!\right]}_{\mathbf{1}}(a')), & a' = \star \\ {\left[\!\left[ x \right]\!\right]}_{\mathbf{1}}(a'), & \text{otherwise} \end{cases}
\\
\lambda \star.\; \mathrm{adjust}_{\mathbf{1}}(\star, h, x) &= \lambda \star.\; h(x)
\\
\mathrm{adjust}_{\mathbf{1}}(\star, h, x) &= h(x)
\\
\\
\mathrm{adjust}_{A+B} &: (A + B) \times (X \to X) \times ((A \twoheadrightarrow X) \times (B \twoheadrightarrow X)) \to (A \twoheadrightarrow X) \times (B \twoheadrightarrow X)
\\
{\left[\!\left[ \mathrm{adjust}_{A+B}(\mathrm{inj}_1(a), h, (t_1, t_2)) \right]\!\right]}_{A+B} &= \lambda a'.\; \begin{cases} h({\left[\!\left[ (t_1,t_2) \right]\!\right]}_{A+B}(a')), & a' = \mathrm{inj}_1(a) \\ {\left[\!\left[ (t_1,t_2) \right]\!\right]}_{A+B}(a'), & \text{otherwise} \end{cases}
\\
{\left[\!\left[ \mathrm{adjust}_{A+B}(\mathrm{inj}_1(a), h, (t_1, t_2)) \right]\!\right]}_{A+B} &= \lambda \begin{cases} \mathrm{inj}_1(a') \mapsto \begin{cases} h({\left[\!\left[ t_1 \right]\!\right]}_A(a')), & a' = a \\ {\left[\!\left[ t_1 \right]\!\right]}_A(a'), & \text{otherwise} \end{cases} \\ \mathrm{inj}_2(b') \mapsto {\left[\!\left[ t_2 \right]\!\right]}_B(b') \end{cases}
\\
{\left[\!\left[ \mathrm{adjust}_{A+B}(\mathrm{inj}_1(a), h, (t_1, t_2)) \right]\!\right]}_{A+B} &= \lambda \begin{cases} \mathrm{inj}_1(a') \mapsto {\left[\!\left[ \mathrm{adjust}_A(a, h, t_1) \right]\!\right]}_A(a') \\ \mathrm{inj}_2(b') \mapsto {\left[\!\left[ t_2 \right]\!\right]}_B(b') \end{cases}
\\
{\left[\!\left[ \mathrm{adjust}_{A+B}(\mathrm{inj}_1(a), h, (t_1, t_2)) \right]\!\right]}_{A+B} &= {\left[\!\left[ (\mathrm{adjust}_A(a, h, t_1),\; t_2) \right]\!\right]}_{A+B}
\\
\mathrm{adjust}_{A+B}(\mathrm{inj}_1(a), h, (t_1, t_2)) &= (\mathrm{adjust}_A(a, h, t_1),\; t_2)
\\
\\
{\left[\!\left[ \mathrm{adjust}_{A+B}(\mathrm{inj}_2(b), h, (t_1, t_2)) \right]\!\right]}_{A+B} &= \lambda a'.\; \begin{cases} h({\left[\!\left[ (t_1,t_2) \right]\!\right]}_{A+B}(a')), & a' = \mathrm{inj}_2(b) \\ {\left[\!\left[ (t_1,t_2) \right]\!\right]}_{A+B}(a'), & \text{otherwise} \end{cases}
\\
{\left[\!\left[ \mathrm{adjust}_{A+B}(\mathrm{inj}_2(b), h, (t_1, t_2)) \right]\!\right]}_{A+B} &= \lambda \begin{cases} \mathrm{inj}_1(a') \mapsto {\left[\!\left[ t_1 \right]\!\right]}_A(a') \\ \mathrm{inj}_2(b') \mapsto \begin{cases} h({\left[\!\left[ t_2 \right]\!\right]}_B(b')), & b' = b \\ {\left[\!\left[ t_2 \right]\!\right]}_B(b'), & \text{otherwise} \end{cases} \end{cases}
\\
{\left[\!\left[ \mathrm{adjust}_{A+B}(\mathrm{inj}_2(b), h, (t_1, t_2)) \right]\!\right]}_{A+B} &= \lambda \begin{cases} \mathrm{inj}_1(a') \mapsto {\left[\!\left[ t_1 \right]\!\right]}_A(a') \\ \mathrm{inj}_2(b') \mapsto {\left[\!\left[ \mathrm{adjust}_B(b, h, t_2) \right]\!\right]}_B(b') \end{cases}
\\
{\left[\!\left[ \mathrm{adjust}_{A+B}(\mathrm{inj}_2(b), h, (t_1, t_2)) \right]\!\right]}_{A+B} &= {\left[\!\left[ (t_1,\; \mathrm{adjust}_B(b, h, t_2)) \right]\!\right]}_{A+B}
\\
\mathrm{adjust}_{A+B}(\mathrm{inj}_2(b), h, (t_1, t_2)) &= (t_1,\; \mathrm{adjust}_B(b, h, t_2))
\\
\\
\mathrm{adjust}_{A \times B} &: (A \times B) \times (X \to X) \times (A \twoheadrightarrow (B \twoheadrightarrow X)) \to (A \twoheadrightarrow (B \twoheadrightarrow X))
\\
{\left[\!\left[ \mathrm{adjust}_{A \times B}((a,b), h, t) \right]\!\right]}_{A \times B} &= \lambda (a',b').\; \begin{cases} h({\left[\!\left[ t \right]\!\right]}_{A \times B}(a',b')), & (a',b') = (a,b) \\ {\left[\!\left[ t \right]\!\right]}_{A \times B}(a',b'), & \text{otherwise} \end{cases}
\\
\lambda (a',b').\; {\left[\!\left[ {\left[\!\left[ \mathrm{adjust}_{A \times B}((a,b), h, t) \right]\!\right]}_A(a') \right]\!\right]}_B(b') &= \lambda (a',b').\; \begin{cases} h({\left[\!\left[ {\left[\!\left[ t \right]\!\right]}_A(a') \right]\!\right]}_B(b')), & a' = a \;\land\; b' = b \\ {\left[\!\left[ {\left[\!\left[ t \right]\!\right]}_A(a') \right]\!\right]}_B(b'), & \text{otherwise} \end{cases}
\\
{\left[\!\left[ {\left[\!\left[ \mathrm{adjust}_{A \times B}((a,b), h, t) \right]\!\right]}_A(a') \right]\!\right]}_B(b') &= \begin{cases} h({\left[\!\left[ {\left[\!\left[ t \right]\!\right]}_A(a') \right]\!\right]}_B(b')), & a' = a \;\land\; b' = b \\ {\left[\!\left[ {\left[\!\left[ t \right]\!\right]}_A(a') \right]\!\right]}_B(b'), & \text{otherwise} \end{cases}
\\
{\left[\!\left[ {\left[\!\left[ \mathrm{adjust}_{A \times B}((a,b), h, t) \right]\!\right]}_A(a') \right]\!\right]}_B(b') &= \begin{cases} \begin{cases} h({\left[\!\left[ {\left[\!\left[ t \right]\!\right]}_A(a) \right]\!\right]}_B(b')), & b' = b \\ {\left[\!\left[ {\left[\!\left[ t \right]\!\right]}_A(a) \right]\!\right]}_B(b'), & \text{otherwise} \end{cases}, & a' = a \\ {\left[\!\left[ {\left[\!\left[ t \right]\!\right]}_A(a') \right]\!\right]}_B(b'), & a' \neq a \end{cases}
\\
{\left[\!\left[ {\left[\!\left[ \mathrm{adjust}_{A \times B}((a,b), h, t) \right]\!\right]}_A(a') \right]\!\right]}_B(b') &= \begin{cases} {\left[\!\left[ \mathrm{adjust}_B(b, h, {\left[\!\left[ t \right]\!\right]}_A(a)) \right]\!\right]}_B(b'), & a' = a \\ {\left[\!\left[ {\left[\!\left[ t \right]\!\right]}_A(a') \right]\!\right]}_B(b'), & a' \neq a \end{cases}
\\
\lambda b'.\; {\left[\!\left[ {\left[\!\left[ \mathrm{adjust}_{A \times B}((a,b), h, t) \right]\!\right]}_A(a') \right]\!\right]}_B(b') &= \lambda b'.\; \begin{cases} {\left[\!\left[ \mathrm{adjust}_B(b, h, {\left[\!\left[ t \right]\!\right]}_A(a)) \right]\!\right]}_B(b'), & a' = a \\ {\left[\!\left[ {\left[\!\left[ t \right]\!\right]}_A(a') \right]\!\right]}_B(b'), & a' \neq a \end{cases}
\\
{\left[\!\left[ {\left[\!\left[ \mathrm{adjust}_{A \times B}((a,b), h, t) \right]\!\right]}_A(a') \right]\!\right]}_B &= \begin{cases} {\left[\!\left[ \mathrm{adjust}_B(b, h, {\left[\!\left[ t \right]\!\right]}_A(a)) \right]\!\right]}_B, & a' = a \\ {\left[\!\left[ {\left[\!\left[ t \right]\!\right]}_A(a') \right]\!\right]}_B, & a' \neq a \end{cases}
\\
{\left[\!\left[ \mathrm{adjust}_{A \times B}((a,b), h, t) \right]\!\right]}_A(a') &= \begin{cases} \mathrm{adjust}_B(b, h, {\left[\!\left[ t \right]\!\right]}_A(a)), & a' = a \\ {\left[\!\left[ t \right]\!\right]}_A(a'), & a' \neq a \end{cases}
\\
{\left[\!\left[ \mathrm{adjust}_{A \times B}((a,b), h, t) \right]\!\right]}_A(a') &= \begin{cases} (\lambda u.\; \mathrm{adjust}_B(b, h, u))({\left[\!\left[ t \right]\!\right]}_A(a')), & a' = a \\ {\left[\!\left[ t \right]\!\right]}_A(a'), & a' \neq a \end{cases}
\\
\lambda a'.\; {\left[\!\left[ \mathrm{adjust}_{A \times B}((a,b), h, t) \right]\!\right]}_A(a') &= \lambda a'.\; \begin{cases} (\lambda u.\; \mathrm{adjust}_B(b, h, u))({\left[\!\left[ t \right]\!\right]}_A(a')), & a' = a \\ {\left[\!\left[ t \right]\!\right]}_A(a'), & a' \neq a \end{cases}
\\
{\left[\!\left[ \mathrm{adjust}_{A \times B}((a,b), h, t) \right]\!\right]}_A &= {\left[\!\left[ \mathrm{adjust}_A(a, \;\lambda u.\; \mathrm{adjust}_B(b, h, u),\; t) \right]\!\right]}_A
\\
\mathrm{adjust}_{A \times B}((a,b), h, t) &= \mathrm{adjust}_A(a, \;\lambda u.\; \mathrm{adjust}_B(b, h, u),\; t)
\\
\\
\mathrm{adjust}_{A \to B} &: (A \to B) \times (X \to X) \times ((A \twoheadrightarrow B) \twoheadrightarrow X) \to ((A \twoheadrightarrow B) \twoheadrightarrow X)
\\
{\left[\!\left[ \mathrm{adjust}_{A \to B}(f, h, t) \right]\!\right]}_{A \to B} &= \lambda g.\; \begin{cases} h({\left[\!\left[ t \right]\!\right]}_{A \to B}(g)), & g = f \\ {\left[\!\left[ t \right]\!\right]}_{A \to B}(g), & \text{otherwise} \end{cases}
\\
\lambda g.\; {\left[\!\left[ \mathrm{adjust}_{A \to B}(f, h, t) \right]\!\right]}_{A \twoheadrightarrow B}(\mathrm{trie}_A(g)) &= \lambda g.\; \begin{cases} h({\left[\!\left[ t \right]\!\right]}_{A \twoheadrightarrow B}(\mathrm{trie}_A(g))), & g = f \\ {\left[\!\left[ t \right]\!\right]}_{A \twoheadrightarrow B}(\mathrm{trie}_A(g)), & \text{otherwise} \end{cases}
\\
\lambda s.\; {\left[\!\left[ \mathrm{adjust}_{A \to B}(f, h, t) \right]\!\right]}_{A \twoheadrightarrow B}(s) &= \lambda s.\; \begin{cases} h({\left[\!\left[ t \right]\!\right]}_{A \twoheadrightarrow B}(s)), & s = \mathrm{trie}_A(f) \\ {\left[\!\left[ t \right]\!\right]}_{A \twoheadrightarrow B}(s), & \text{otherwise} \end{cases}
\\
{\left[\!\left[ \mathrm{adjust}_{A \to B}(f, h, t) \right]\!\right]}_{A \twoheadrightarrow B} &= {\left[\!\left[ \mathrm{adjust}_{A \twoheadrightarrow B}(\mathrm{trie}_A(f), h, t) \right]\!\right]}_{A \twoheadrightarrow B}
\\
\mathrm{adjust}_{A \to B}(f, h, t) &= \mathrm{adjust}_{A \twoheadrightarrow B}(\mathrm{trie}_A(f), h, t)
\end{align}
$$
Credit Where It Is Due
I'm not doing anything fundamentally novel.
- Ralf Hinze pioneered the work on polytypic memo functions in a couple papers.
- Conal Elliott, inspired by a now-lost implementation by Spencer Janssen, developed a Haskell package and extended the idea in a series of blog posts.
- MemoTrie package on Hackage
- Elegant memoization with functional memo tries (the first of the blog post series)
The main difference between the above works and mine is that I include the update function adjust. My presentation is a bit different as well, but that's pretty much it.
Coming Up
In Part 3, I'll write a basic Haskell implementation, demonstrate it in action, and provide some visualizations to help see what's going on.
Member discussion