1032 lines
26 KiB
Markdown
1032 lines
26 KiB
Markdown
|
---
|
||
|
title: Functional Data Structures
|
||
|
subtitle: Analysis and Implementation
|
||
|
author: Levi Pearson
|
||
|
...
|
||
|
|
||
|
# Intro
|
||
|
|
||
|
### Outline
|
||
|
|
||
|
1. Basics of Functional Data Structures
|
||
|
* Immutability
|
||
|
* Persistent vs. Ephemeral
|
||
|
* Node Structure
|
||
|
* Data Sharing
|
||
|
|
||
|
2. Analysis Techniques
|
||
|
* Amortization Overview
|
||
|
* Amortization with Persistence
|
||
|
|
||
|
3. Design Techniques
|
||
|
* Lazy Rebuilding and Scheduling
|
||
|
* Numerical Representations
|
||
|
* Non-Uniform Structure
|
||
|
|
||
|
4. Real-World Examples
|
||
|
* Zippers
|
||
|
* 2-3 Finger Trees
|
||
|
* Hash Array Mapped Tries
|
||
|
|
||
|
### Motivation
|
||
|
|
||
|
Functional data structures provide:
|
||
|
|
||
|
* Relatively simple implementations
|
||
|
* Good worst-case performance bounds
|
||
|
* Persistence for free
|
||
|
* Ease of use in concurrent programs
|
||
|
|
||
|
## Basics
|
||
|
|
||
|
### Functional Data Structures?
|
||
|
|
||
|
What do we mean by "Functional Data Structure"?
|
||
|
|
||
|
* Built from immutable values
|
||
|
* Can be implemented in a purely functional language
|
||
|
* No in-place updates!
|
||
|
|
||
|
Obviously not the data structures taught in Intro to Algorithms class...
|
||
|
|
||
|
### Persistence
|
||
|
|
||
|
Most imperative structures are *ephemeral*; updates destroy previous versions
|
||
|
|
||
|
A data structure is *persistent* when old versions remain after updates
|
||
|
|
||
|
* Developed in the context of imperative structures
|
||
|
* Imperative versions of persistent structures can be quite complex
|
||
|
* Adding persistence can change performance bounds
|
||
|
* Purely functional structures are automatically persistent!
|
||
|
|
||
|
### Value of Persistence
|
||
|
|
||
|
Why would we want persistence anyway?
|
||
|
|
||
|
* Safe data sharing
|
||
|
* No locking required
|
||
|
* Restoration of previous states
|
||
|
* Interesting algorithms (e.g. in computational geometry)
|
||
|
|
||
|
### Node-Based Structure
|
||
|
|
||
|
Functional data structures share structural characteristics:
|
||
|
|
||
|
* Built from *nodes* and *references*
|
||
|
* Nodes contain references:
|
||
|
* one or more references to other nodes
|
||
|
* one or more references to values
|
||
|
* A structure may be built from one or more *kinds* of node
|
||
|
|
||
|
Examples:
|
||
|
|
||
|
~~~ {.haskell}
|
||
|
data List a = Nil
|
||
|
| Cons a (List a)
|
||
|
|
||
|
data Tree a = Leaf a
|
||
|
| Branch (Tree a) (Tree a)
|
||
|
~~~
|
||
|
|
||
|
### Data Sharing
|
||
|
|
||
|
The fine-grained node structure has some benefits:
|
||
|
|
||
|
* New versions of structures share the bulk of their data with old versions
|
||
|
* Copying on updates is limited to *nodes* rather than *values*
|
||
|
* Only the nodes on the path from the root to change point are copied.
|
||
|
|
||
|
Also some drawbacks:
|
||
|
|
||
|
* Extra overhead for references vs. in-place storage
|
||
|
* Locality of data can suffer
|
||
|
|
||
|
# Analysis
|
||
|
|
||
|
## Amortization
|
||
|
|
||
|
### Structures Enable Algorithms
|
||
|
|
||
|
* Data structures support *operations* on them
|
||
|
* Operations are implemented via *algorithms*
|
||
|
* Algorithms perform operations
|
||
|
|
||
|
We bottom out at two primitive operations:
|
||
|
|
||
|
* Construction of an ADT value
|
||
|
* Destructuring/Pattern Matching of an ADT value
|
||
|
|
||
|
### Algorithm comparison
|
||
|
|
||
|
We judge between algorithms based on analysis:
|
||
|
|
||
|
* How many primitive operations are needed to complete the algorithm?
|
||
|
* How does the number of operations scale as the problem size grows?
|
||
|
|
||
|
*Asymptotic analysis* gives an upper bound to the growth of operation
|
||
|
count as the size grows to infinity.
|
||
|
|
||
|
### Worst Case
|
||
|
|
||
|
Typical analysis is *worst-case*:
|
||
|
|
||
|
* Find a function such that some constant times that function of the
|
||
|
input size is *always* greater than the actual count of operations
|
||
|
for any single operation.
|
||
|
* That function is a worst-case upper bound.
|
||
|
* The smallest upper bound is how we characterize the asymptotic
|
||
|
worst-case behavior.
|
||
|
|
||
|
### Average Case
|
||
|
|
||
|
Sometimes we consider *average-case*:
|
||
|
|
||
|
* Find the upper bounds of all possible single operation cases.
|
||
|
* Model the statistical likelihood of the cases.
|
||
|
* Find the smallest upper bound of the weighted combination of cases.
|
||
|
|
||
|
### Amortized
|
||
|
|
||
|
An *amortized* analysis is different:
|
||
|
|
||
|
* Find the upper bounds of all possible single operation cases.
|
||
|
* Consider the possible *sequences* of operations.
|
||
|
* Show that *expensive* operations always run infrequently enough to
|
||
|
distribute their cost among *cheap* operations
|
||
|
|
||
|
### When To Amortize?
|
||
|
|
||
|
A **worst-case** analysis is necessary when you have hard latency bounds,
|
||
|
e.g. real-time requirements.
|
||
|
|
||
|
**Average** or **amortized** analysis can give a better sense of
|
||
|
*expected throughput* of an algorithm.
|
||
|
|
||
|
**Amortized** analysis is useful when most operations are cheap, but
|
||
|
occasionally an expensive one must be performed.
|
||
|
|
||
|
Examples:
|
||
|
|
||
|
* Array doubling
|
||
|
* Rehashing
|
||
|
* Tree balancing
|
||
|
|
||
|
Developed in the context of *ephemeral* data structures.
|
||
|
|
||
|
### The Persistence Wrench
|
||
|
|
||
|
An ephemeral structure has a single logical future: Once an operation
|
||
|
is performed, there's no going back.
|
||
|
|
||
|
The amortization accounting process is straightforward:
|
||
|
|
||
|
* Assign an extra *credit* cost to cheap cases.
|
||
|
* Prove that an expensive case only occurs after sufficient credit
|
||
|
has accumulated to pay for it.
|
||
|
|
||
|
Persistent structures may have *multiple* logical futures.
|
||
|
|
||
|
Consider a tree very close to needing an expensive rebalancing:
|
||
|
|
||
|
~~~ {.Haskell}
|
||
|
-- tr1 refers to the teetering tree
|
||
|
let tr2 = insert foo tr1 -- causes a rebalance: expensive
|
||
|
let tr3 = insert foo tr1 -- causes another rebalance
|
||
|
let tr4 = insert foo tr1 -- causes yet another rebalance
|
||
|
~~~
|
||
|
|
||
|
How to account for that? You can't spend your savings more than once.
|
||
|
|
||
|
## Functional Amortization
|
||
|
|
||
|
### Tackling Persistence with Laziness
|
||
|
|
||
|
Chris Okasaki discovered a way around the multiple-future problem:
|
||
|
|
||
|
* Delay expensive operations via lazy evaluation
|
||
|
* Account for expensive operations via *debt*
|
||
|
* Savings can only be spent once, but over-paying on debt is ok
|
||
|
|
||
|
This works because:
|
||
|
|
||
|
* A suspended computation places a *thunk* in the new structure
|
||
|
* When a thunk is *forced*, it performs its calculation
|
||
|
* The value replaces the thunk *in place*
|
||
|
|
||
|
The previous example, with lazy evaluation:
|
||
|
|
||
|
~~~ {.Haskell}
|
||
|
-- tr1 refers to the teetering tree
|
||
|
let tr2 = insert foo tr1 -- creates a rebalancing thunk: cheap
|
||
|
let tr3 = insert bar tr2 -- forces rebalancing: expensive
|
||
|
let tr4 = insert baz tr2 -- sees previous rebalancing
|
||
|
~~~
|
||
|
|
||
|
### Outline of Amortized Lazy Analysis
|
||
|
|
||
|
Like a layaway plan:
|
||
|
|
||
|
1. Find a computation you can't afford to execute
|
||
|
2. Create a suspension for it, assign a value proportional to its shared cost
|
||
|
3. Pay the debt off a little at a time
|
||
|
4. When the debt is paid off, the suspension may be executed
|
||
|
|
||
|
Consider each logical future as if it were the *only one*.
|
||
|
|
||
|
Each future must fully pay before invoking any suspension.
|
||
|
|
||
|
### Restoring Worst-Case Bounds
|
||
|
|
||
|
Sometimes the latency budget requires a tight worst-case bound on all
|
||
|
operations.
|
||
|
|
||
|
Worst-case bounds can be restored via *scheduling*
|
||
|
|
||
|
* Execute suspended computations in bounded steps
|
||
|
* Execute the steps as they are paid off
|
||
|
|
||
|
This has a small space and complexity overhead, but bounds the latency.
|
||
|
|
||
|
### Using Analysis Results
|
||
|
|
||
|
Sure that sounds like *fun*, but what's it good for?
|
||
|
|
||
|
Different operations over structures (e.g. head-insert, tail-insert,
|
||
|
search) are necessary to implement higher-level algorithms.
|
||
|
|
||
|
Choose the data structure that:
|
||
|
|
||
|
* Provides good bounds for the operations your algorithm needs
|
||
|
* Provides the latency guarantees you need
|
||
|
|
||
|
Other factors as well, of course
|
||
|
|
||
|
# Implementation Patterns
|
||
|
|
||
|
## Lazy Rebuilding
|
||
|
|
||
|
### Queues, Take 1
|
||
|
|
||
|
The essence of the Queue:
|
||
|
|
||
|
* `insert` adds an element at the end of the queue
|
||
|
* `remove` takes an element from the front of the queue
|
||
|
|
||
|
How to represent it? What about a List?
|
||
|
|
||
|
* Our `remove` operation is $O \left({1}\right)$
|
||
|
* But `insert` is $O \left({n}\right)$!
|
||
|
|
||
|
The standard trick: A pair of lists, `left` and `right`.
|
||
|
|
||
|
* The `left` list holds the head of the queue
|
||
|
* The `right` list holds the tail, stored in reversed order
|
||
|
|
||
|
When `left` becomes empty, reverse `right` and put it in `left`.
|
||
|
|
||
|
### Queue Implementation
|
||
|
|
||
|
~~~ {.Haskell}
|
||
|
module Queue1 where
|
||
|
import Prelude hiding (reverse, length)
|
||
|
import StrictList (head, tail, reverse)
|
||
|
data Queue a = Q { left :: StrictList a, leftLen :: !Int
|
||
|
, right :: StrictList a, rightLen :: !Int
|
||
|
} deriving (Show, Eq)
|
||
|
empty :: Queue a
|
||
|
empty = Q Empty 0 Empty 0
|
||
|
length :: Queue a -> Int
|
||
|
length (Q _ ll _ rl) = ll + rl
|
||
|
insert :: a -> Queue a -> Queue a
|
||
|
insert e (Q l ll r rl) = Q l ll (e :$ r) (rl + 1)
|
||
|
remove :: Queue a -> (a, Queue a)
|
||
|
remove (Q Empty _ r rl) = remove (Q (reverse r) rl Empty 0)
|
||
|
remove (Q l ll r rl) = (head l, Q (tail l) (ll - 1) r rl)
|
||
|
~~~
|
||
|
|
||
|
### Queue Analysis
|
||
|
|
||
|
Most operations are trivial:
|
||
|
|
||
|
* `empty`: $O \left({1}\right)$
|
||
|
* `insert`: $O \left({1}\right)$
|
||
|
* `length`: $O \left({1}\right)$
|
||
|
|
||
|
`remove` will take some thought
|
||
|
|
||
|
~~~ {.Haskell}
|
||
|
remove :: Queue a -> (a, Queue a)
|
||
|
remove (Q Empty _ r rl) = remove (Q (reverse r) rl Empty 0)
|
||
|
remove (Q l ll r rl) = (head l, Q (tail l) (ll - 1) r rl)
|
||
|
~~~
|
||
|
|
||
|
* When `left` is non-empty, `remove` takes $O \left({1}\right)$
|
||
|
* But when empty, it takes $O \left({n}\right)$ due to the `reverse`
|
||
|
|
||
|
### Amortizing the Queue
|
||
|
|
||
|
* The `reverse` operation takes place only when `left` is empty
|
||
|
|
||
|
* `left` is empty when:
|
||
|
* The entire queue is empty
|
||
|
* `n` values have been `insert`ed without a `remove`
|
||
|
* All previously reversed values have been `remove`d.
|
||
|
|
||
|
* A `remove` after `n` `inserts` will take `n` operations.
|
||
|
|
||
|
* Those `n` values must be `remove`d before another `reverse` can
|
||
|
occur.
|
||
|
|
||
|
* So, assigning extra credit to each `insert` should offset the linear
|
||
|
cost of the `reverse`...
|
||
|
|
||
|
### Not So Fast
|
||
|
|
||
|
~~~ {.Haskell}
|
||
|
let q1 = foldr insert empty "Abracadabra"
|
||
|
let q2 = remove q1 -- Causes a reverse
|
||
|
let q3 = remove q1 -- Causes the same reverse
|
||
|
let q4 = remove q1 -- And again...
|
||
|
~~~
|
||
|
|
||
|
Oops. Persistence screws up our amortization scheme.
|
||
|
|
||
|
Time to make it lazy.
|
||
|
|
||
|
### Queue Take 2 - Paired Lazy Lists
|
||
|
|
||
|
Goal: Reverse `right` *incrementally* before `left` becomes empty.
|
||
|
|
||
|
Means: Periodically append `reverse right` to `left` using *guarded recursion*.
|
||
|
|
||
|
~~~ {.Haskell}
|
||
|
-- rot l r [] = l ++ reverse r
|
||
|
rot [] r a = head r : a
|
||
|
rot l r a = head l : rot (tail l) (tail r) (head r : a)
|
||
|
~~~
|
||
|
|
||
|
This evaluates up to the (`:`) operation per call to `head` on the
|
||
|
result.
|
||
|
|
||
|
### Queue Take 2 - Paired Lazy Lists (cont.)
|
||
|
|
||
|
Forcing a tail:
|
||
|
|
||
|
* performs one step of the `++` and one step of the `reverse`.
|
||
|
* may also force another tail; but only $\log n$ times.
|
||
|
|
||
|
So worst case improves from $O\left({n}\right)$ to $O\left({\log n}\right)$.
|
||
|
|
||
|
If we never let `right` be longer than `left`, by the time `rotate`
|
||
|
steps through the original `left`, the `reverse` will be complete.
|
||
|
|
||
|
Amortized cost is therefore $O \left({1}\right)$ again, and safe for
|
||
|
persistence.
|
||
|
|
||
|
### Full Implementation
|
||
|
|
||
|
~~~ {.Haskell}
|
||
|
module Queue2 where
|
||
|
import Prelude hiding (length)
|
||
|
data Queue a = Q { left :: [a], leftLen :: !Int
|
||
|
, right :: [a], rightLen :: !Int
|
||
|
} deriving (Show, Eq)
|
||
|
empty :: Queue a
|
||
|
empty = Q [] 0 [] 0
|
||
|
length :: Queue a -> Int
|
||
|
length (Q _ ll _ rl) = ll + rl
|
||
|
insert :: a -> Queue a -> Queue a
|
||
|
insert e (Q l ll r rl) = makeQ (Q l ll (e : r) (rl + 1))
|
||
|
remove :: Queue a -> (a, Queue a)
|
||
|
remove (Q l ll r rl) = (head l, makeQ (Q (tail l) (ll-1) r rl))
|
||
|
makeQ :: Queue a -> Queue a -- Preserve invariant: |R| <= |L|
|
||
|
makeQ q@(Q l ll r rl)
|
||
|
| rl <= ll = q
|
||
|
| otherwise = Q (rot l r []) (ll + rl) [] 0
|
||
|
rot :: [a] -> [a] -> [a] -> [a]
|
||
|
rot [] r a = head r : a
|
||
|
rot l r a = head l : rot (tail l) (tail r) (head r : a)
|
||
|
~~~
|
||
|
|
||
|
## Numerical Representations
|
||
|
|
||
|
### Arithmetic
|
||
|
|
||
|
> Many people regard arithmetic as a trivial thing that children learn
|
||
|
> and computers do, but we will see that arithmetic is a fascinating
|
||
|
> topic with many interesting facets. It is important to make a
|
||
|
> thorough study of efficient methods for calculating with numbers,
|
||
|
> since arithmetic underlies so many computer applications.
|
||
|
|
||
|
Donald E. Knuth, from "The Art of Computer Programming"
|
||
|
|
||
|
### Nat and List
|
||
|
|
||
|
Inductive definition of Naturals:
|
||
|
|
||
|
~~~ {.Haskell}
|
||
|
data Nat = Zero
|
||
|
| Succ Nat
|
||
|
~~~
|
||
|
|
||
|
Inductive definition of Lists:
|
||
|
|
||
|
~~~ {.Haskell}
|
||
|
data List a = Nil
|
||
|
| Cons a (List a)
|
||
|
~~~
|
||
|
|
||
|
Functions on `Nat` and `List` are analogous too:
|
||
|
|
||
|
* `inc` of a `Nat` is like `cons` on a `List`
|
||
|
* `dec` of a `Nat` is like removing `head` of a `List`
|
||
|
* adding two `Nat`s is like combining two `List`s
|
||
|
|
||
|
Lists are *numbers* that contain *values*.
|
||
|
|
||
|
`Nat` and `List` implement a *unary* number system.
|
||
|
|
||
|
### Positional Numbers - Definitions
|
||
|
|
||
|
A *positional number* is written as a series of digits:
|
||
|
$$
|
||
|
b_0 \dots b_{m-1}
|
||
|
$$
|
||
|
The digit $b_0$ is the *least significant digit*
|
||
|
|
||
|
The digit $b_{m-1}$ is the *most significant digit*
|
||
|
|
||
|
Each digit $b_i$ has a *weight* $w_i$, so the value is:
|
||
|
$$
|
||
|
\sum\limits_{i=0}^{m-1} b_i w_i
|
||
|
$$
|
||
|
A number is *base* $B$ if $w_i=B^i$ and $D_i=\left\{0,\dots,B-1\right\}$.
|
||
|
|
||
|
Usually weights are increasing sequences of powers and $D_i$ is the
|
||
|
same for every digit.
|
||
|
|
||
|
### Redundancy
|
||
|
|
||
|
A number system is *redundant* if there is more than one way to
|
||
|
represent some numbers.
|
||
|
|
||
|
Take the system where $w_i = 2^i$ and $D_i = \left\{0,1,2\right\}$
|
||
|
|
||
|
Decimal $13$ can be written:
|
||
|
|
||
|
* $1011$
|
||
|
* $1201$
|
||
|
* $122$
|
||
|
|
||
|
### Dense and Sparse Representations
|
||
|
|
||
|
*Dense* representations are simple lists/sequences of digits,
|
||
|
including $0$.
|
||
|
|
||
|
*Sparse* representations exclude $0$ digits, so must include either:
|
||
|
|
||
|
* *rank* (the index in the sequence) or
|
||
|
* *weight*
|
||
|
|
||
|
### Dense Example
|
||
|
|
||
|
Straightforward list of binary digits, but least significant bit first
|
||
|
|
||
|
~~~ {.Haskell}
|
||
|
data Digit = Zero | One
|
||
|
type DenseNat = [Digit] -- increasing order of significance
|
||
|
inc [] = [One]
|
||
|
inc (Zero : ds) = One : ds
|
||
|
inc (One : ds) = Zero : inc ds -- carry
|
||
|
dec [One] = []
|
||
|
dec (One : ds) = Zero : ds
|
||
|
dec (Zero : ds) = One : dec ds -- borrow
|
||
|
add ds [] = ds
|
||
|
add [] ds = ds
|
||
|
add (d : ds1) (Zero : ds2) = d : add ds1 ds2
|
||
|
add (Zero : ds1) (d : ds2) = d : add ds1 ds2
|
||
|
add (One : ds1) (One : ds2) = Zero : inc (add ds1 ds2) -- carry
|
||
|
~~~
|
||
|
|
||
|
### Sparse Example
|
||
|
|
||
|
Store a list of $d_i w_i$ values, so `Zero` digits won't appear
|
||
|
|
||
|
~~~ {.Haskell}
|
||
|
type SparseNat = [Int] -- increasing list of powers of 2
|
||
|
carry w [] = [w]
|
||
|
carry w ws@(w' : rest)
|
||
|
| w < w' = w : ws
|
||
|
| otherwise = carry (2 * w) rest
|
||
|
borrow w ws@(w' : rest)
|
||
|
| w < w' = rest
|
||
|
| otherwise = w : borrow (2 * w) ws
|
||
|
inc ws = carry 1 ws
|
||
|
dec ws = borrow 1 ws
|
||
|
add ws [] = ws
|
||
|
add [] ws = ws
|
||
|
add m@(w1 : ws1) n@(w2 : ws2)
|
||
|
| w1 < w2 = w1 : add ws1 n
|
||
|
| w2 < w1 = w2 : add m ws2
|
||
|
| otherwise = carry (2 * w1) (add ws1 ws2)
|
||
|
~~~
|
||
|
|
||
|
### Binary Numerical Representations
|
||
|
|
||
|
We can transform a positional number system into a data structure:
|
||
|
|
||
|
* Replace sequence of digits with a sequence of trees
|
||
|
* Number and size of trees is determined by representation of numbers
|
||
|
|
||
|
For example:
|
||
|
|
||
|
* The binary representation of $73$ is $1001001$
|
||
|
* A collection of size $73$ would contain three trees:
|
||
|
* size $1$
|
||
|
* size $8$
|
||
|
* size $64$
|
||
|
|
||
|
### Complete Binary Leaf Trees
|
||
|
|
||
|
A binary leaf tree is either:
|
||
|
|
||
|
* a `Leaf`, containing a value; or
|
||
|
* a `Fork`, containing two binary leaf trees
|
||
|
|
||
|
A *complete* binary leaf tree has all `Leaf` nodes at the same rank.
|
||
|
|
||
|
~~~ {.Haskell}
|
||
|
data BLT a = Leaf a
|
||
|
| Fork (BLT a) (BLT a)
|
||
|
~~~
|
||
|
|
||
|
### Binary Random-Access List
|
||
|
|
||
|
Let's combine a dense binary representation with a `BLT`
|
||
|
|
||
|
First, let's annotate the `BLT`s with their size:
|
||
|
|
||
|
~~~ {.Haskell}
|
||
|
data BLT a = Leaf a
|
||
|
| Fork !Int (BLT a) (BLT a) deriving (Show)
|
||
|
~~~
|
||
|
|
||
|
Recall the types we used for `DenseNat`:
|
||
|
|
||
|
~~~ {.Haskell}
|
||
|
data Digit = Zero | One
|
||
|
deriving (Show, Ord, Eq)
|
||
|
type DenseNat = [Digit] -- increasing order of significance
|
||
|
~~~
|
||
|
|
||
|
We'll adapt them to store a `BLT` on `One` digits:
|
||
|
|
||
|
~~~ {.Haskell}
|
||
|
data Digit a = Zero | One (BLT a)
|
||
|
deriving (Show, Ord, Eq)
|
||
|
type RandList a = [Digit a]
|
||
|
~~~
|
||
|
|
||
|
### Some RandList Operations
|
||
|
|
||
|
We had an `inc` operation for `Nat`
|
||
|
|
||
|
~~~ {.Haskell}
|
||
|
inc :: DenseNat -> DenseNat
|
||
|
inc [] = [One]
|
||
|
inc (Zero : ds) = One : ds
|
||
|
inc (One : ds) = Zero : inc ds -- carry
|
||
|
~~~
|
||
|
|
||
|
We adapt it to `cons` for `RandList`
|
||
|
|
||
|
~~~ {.Haskell}
|
||
|
cons :: a -> RandList a -> RandList a
|
||
|
cons x ts = insTree (Leaf x) ts
|
||
|
|
||
|
insTree t [] = [One t]
|
||
|
insTree t (Zero : ts) = One t : ts
|
||
|
insTree t1 (One t2 : ts) = Zero : insTree (link t1 t2) ts
|
||
|
where link t1 t2 = Fork (size t1 + size t2) t1 t2
|
||
|
size (Leaf _) = 1
|
||
|
size (Fork n _ _ ) = n
|
||
|
~~~
|
||
|
|
||
|
### Some RandList Operations
|
||
|
|
||
|
We had a `dec` operation for `Nat`
|
||
|
|
||
|
~~~ {.Haskell}
|
||
|
dec :: DenseNat -> DenseNat
|
||
|
dec [One] = []
|
||
|
dec (One : ds) = Zero : ds
|
||
|
dec (Zero : ds) = One : dec ds -- borrow
|
||
|
~~~
|
||
|
|
||
|
We adapt it to `tail` for `RandList`
|
||
|
|
||
|
~~~ {.Haskell}
|
||
|
tail :: RandList a -> RandList a
|
||
|
tail ts = ts' where (_, ts') = borrowTree ts
|
||
|
|
||
|
borrowTree [One t] = (t, [] )
|
||
|
borrowTree (One t : ts) = (t, Zero : ts )
|
||
|
borrowTree (Zero : ts) = (t1, One t2 : ts')
|
||
|
where (Fork _ t1 t2, ts') = borrowTree ts
|
||
|
~~~
|
||
|
|
||
|
### Properties of RandList
|
||
|
|
||
|
We have arranged an $n$-length sequence as a $\log\left(n+1\right)$
|
||
|
length list of trees of $\log n$ depth.
|
||
|
|
||
|
Operations `cons`, `head`, and `tail` perform $O\left({1}\right)$
|
||
|
work per digit, so their worst-case is $O\left({\log n}\right)$
|
||
|
|
||
|
Operations `lookup` and `update` take at most $O\left({\log n}\right)$
|
||
|
to find the right tree and $O\left({\log n}\right)$ to find the right
|
||
|
element, for a total of $O\left({\log n}\right)$ worst-case time.
|
||
|
|
||
|
For mostly-random access, this will be a significant improvement over
|
||
|
`List`'s $O\left({n}\right)$ worst-case performance.
|
||
|
|
||
|
### Skew Binary Numbers
|
||
|
|
||
|
Our standard binary representation's performance was hindered by
|
||
|
*cascading carries*.
|
||
|
|
||
|
In the **skew binary number** representation:
|
||
|
|
||
|
* The weight $w_i$ of the $i$th digit is $2^{i+1}-1$
|
||
|
* The set of digits $D_i = \left\{0,1,2\right\}$
|
||
|
* Only the lowest non-$0$ digit may be $2$ (for uniqueness)
|
||
|
|
||
|
The number $92$ is $002101$ in this representation:
|
||
|
|
||
|
\begin{align*}
|
||
|
\sum\limits_{i=0}^{5} b_i w_i &=
|
||
|
0+0+(2^{2+1}-1)\times{2}+(2^{3+1}-1)+0+(2^{5+1}-1) \\
|
||
|
&= 14+15+63 \\
|
||
|
&= 92
|
||
|
\end{align*}
|
||
|
|
||
|
### Skew Binary Number Operations
|
||
|
|
||
|
What makes skew binary representation useful:
|
||
|
|
||
|
* $w_i$ is $2^{i+1}-1$
|
||
|
* $1+2\left(2^{i+1}-1\right) = 2^{i+2}-1$
|
||
|
|
||
|
This means that when the lowest non-$0$ digit is $2$, the `inc`
|
||
|
operation:
|
||
|
|
||
|
* resets the $2$ to $0$
|
||
|
* increments the next digit from $0$ to $1$ *or* $1$ to $2$
|
||
|
|
||
|
If there is no $2$, just increment the lowest digit from $0$ to $1$ or
|
||
|
from $1$ to $2$. These only take $O\left({1}\right)$ time!
|
||
|
|
||
|
We must use a *sparse* representation to keep $O\left({1}\right)$
|
||
|
access to the first non-$0$ digit.
|
||
|
|
||
|
### Skew Binary Random Access List
|
||
|
|
||
|
~~~ {.Haskell}
|
||
|
data BTree a = Leaf a | Node a (BTree a) (BTree a)
|
||
|
data Digit a = Digit !Int (BTree a)
|
||
|
type RList a = [Digit a]
|
||
|
cons x (Digit w1 t1 : Digit w2 t2 : ts')
|
||
|
| w1 == w2 = Digit (1 + w1 + w2) (Node x t1 t2) : ts'
|
||
|
cons x ts = Digit 1 (Leaf x) : ts
|
||
|
head (Digit 1 (Leaf x) : _) = x
|
||
|
head (Digit _ (Node x _ _) : _) = x
|
||
|
tail (Digit 1 (Leaf _) : ts) = ts
|
||
|
tail (Digit w (Node _ t1 t2) : ts) =
|
||
|
Digit w' t1 : Digit w' t2 : ts
|
||
|
where w' = w `div` 2
|
||
|
lookup (Digit w t : ts) i
|
||
|
| i < w = lookupTree w t i
|
||
|
| otherwise = lookup ts (i - w)
|
||
|
lookupTree 1 (Leaf x) 0 = x
|
||
|
lookupTree _ (Node x _ _) 0 = x
|
||
|
lookupTree w (Node _ t1 t2) i
|
||
|
| i < w' = lookupTree w' t1 (i-1)
|
||
|
| otherwise = lookupTree w' t2 (i-1-w')
|
||
|
where w' = w `div` 2
|
||
|
~~~
|
||
|
|
||
|
## Non-Uniform Structure
|
||
|
|
||
|
### Random Access Lists
|
||
|
|
||
|
Here is our first Random Access List type:
|
||
|
|
||
|
~~~ {.Haskell}
|
||
|
data BLT a = Leaf a
|
||
|
| Fork (BLT a) (BLT a)
|
||
|
data Digit a = Zero | One (BLT a)
|
||
|
type RandList a = [Digit a]
|
||
|
~~~
|
||
|
|
||
|
It is possible to construct invalid members of this type:
|
||
|
|
||
|
~~~ {.Haskell}
|
||
|
[One (Fork (Leaf 1) (Fork (Leaf 2) (Leaf 3)))]
|
||
|
~~~
|
||
|
|
||
|
Constraint violations:
|
||
|
|
||
|
* All trees should be *complete*
|
||
|
* Incorrect height for its numerical representation
|
||
|
|
||
|
### Random Access Lists (cont.)
|
||
|
|
||
|
We can encode the fact that we only want *complete* trees of the
|
||
|
correct height by combining the structures and "pairing" the type
|
||
|
parameter on each recurrence:
|
||
|
|
||
|
~~~ {.Haskell}
|
||
|
data Fork t = Fork t t -- this is just like (t,t)
|
||
|
data RandList t = Nil
|
||
|
| Zero (RandList (Fork t))
|
||
|
| One t (RandList (Fork t))
|
||
|
~~~
|
||
|
|
||
|
This is known as *polymorphic recursion* or *non-uniform data types*
|
||
|
|
||
|
Some examples:
|
||
|
|
||
|
~~~ {.Haskell}
|
||
|
Nil
|
||
|
One 1 Nil
|
||
|
Zero (One (Fork 1 2) Nil)
|
||
|
Zero (Zero (One (Fork (Fork 1 2) (Fork 3 4)) Nil))
|
||
|
~~~
|
||
|
|
||
|
### Random Access List Implementation
|
||
|
|
||
|
~~~ {.Haskell}
|
||
|
data Fork t = Fork t t
|
||
|
data RandList t = Nil
|
||
|
| Zero (RandList (Fork t))
|
||
|
| One t (RandList (Fork t))
|
||
|
cons :: a -> RandList a -> RandList a
|
||
|
cons x Nil = One x Nil
|
||
|
cons x1 (Zero ds) = One x1 ds
|
||
|
cons x1 (One x2 ds) = Zero (cons (Fork x1 x2) ds)
|
||
|
front :: RandList a -> (a, RandList a)
|
||
|
front Nil = error "Empty RandList"
|
||
|
front (One x Nil) = (x, Nil)
|
||
|
front (One x ds) = (x, Zero ds)
|
||
|
front (Zero ds) = (x1, One x2 ds')
|
||
|
where (Fork x1 x2, ds') = front ds
|
||
|
find :: Int -> RandList a -> a
|
||
|
find 0 Nil = error "Not found"
|
||
|
find 0 (One x _) = x
|
||
|
find i (One _ ds) = find (i-1) (Zero ds)
|
||
|
find i (Zero ds) = if i`mod`2 == 0 then x else y
|
||
|
where (Fork x y) = find (i`div`2) ds
|
||
|
~~~
|
||
|
|
||
|
### Other Numerical Representations
|
||
|
|
||
|
We have only scratched the surface of numerical representations
|
||
|
|
||
|
* Different tree structures at digits
|
||
|
* complete binary trees
|
||
|
* complete leaf trees
|
||
|
* binomial trees
|
||
|
* pennants
|
||
|
|
||
|
* Different number systems
|
||
|
* zeroless binary
|
||
|
* redundant binary
|
||
|
* skew binary
|
||
|
* segmented binary
|
||
|
* higher bases
|
||
|
* fibonacci numbers
|
||
|
* factorials - random permutation algorithm
|
||
|
|
||
|
# Real-World Structures
|
||
|
|
||
|
## Zippers
|
||
|
|
||
|
### What is a Zipper?
|
||
|
|
||
|
Imagine you have a large, complex data structure...
|
||
|
|
||
|
A zipper is a *complementary* structure that tracks a navigation path
|
||
|
through the primary structure.
|
||
|
|
||
|
It contains:
|
||
|
|
||
|
* a *focus point* that refers to the current point in the navigation
|
||
|
* the *path* of the traversal from the root to focus point
|
||
|
|
||
|
Operators on a zipper:
|
||
|
|
||
|
* navigation: `goUp`, `goDown`, `goLeft`, searching, etc.
|
||
|
* update: `replace` the focus point, `insertLeft`, etc.
|
||
|
|
||
|
Traversing a structure with a zipper turns it *inside-out*, and moving
|
||
|
back to the head pulls it back together, like a *zipper*.
|
||
|
|
||
|
### A Zipper on Rose Trees
|
||
|
|
||
|
First, a tree structure, possibly representing a document:
|
||
|
|
||
|
~~~ {.Haskell}
|
||
|
data Tree a = Item a | Section [Tree a]
|
||
|
~~~
|
||
|
|
||
|
We need a structure to record the path; we need to track steps up and
|
||
|
down the structure as well as the position in the list
|
||
|
|
||
|
~~~ {.Haskell}
|
||
|
data Path a = Top | Node [Tree a] (Path a) [Tree a]
|
||
|
~~~
|
||
|
|
||
|
Finally, we need to combine the path with the focus point, which will
|
||
|
just be a subtree of the tree we are traversing:
|
||
|
|
||
|
~~~ {.Haskell}
|
||
|
data Location a = Loc (Tree a) (Path a)
|
||
|
~~~
|
||
|
|
||
|
To build a zipper, we just combine the root of the tree with the `Top`
|
||
|
path constructor in a `Location`
|
||
|
|
||
|
### A Zipper on Rose Trees - Navigation
|
||
|
|
||
|
Basic navigation (error checking elided):
|
||
|
|
||
|
~~~ {.Haskell}
|
||
|
data Path a = Top | Node [Tree a] (Path a) [Tree a]
|
||
|
data Location a = Loc (Tree a) (Path a)
|
||
|
|
||
|
goDown (Loc (Section (t1:trees)) p) =
|
||
|
Loc t1 (Node [] p trees)
|
||
|
|
||
|
goLeft (Loc t (Node (l:left) up right)) =
|
||
|
Loc l (Node left up (t:right))
|
||
|
|
||
|
goRight (Loc t (Node left up (r:right))) =
|
||
|
Loc r (Node (t:left) up right)
|
||
|
|
||
|
goUp (Loc t (Node left up right)) =
|
||
|
Loc (Section (reverse left ++ t:right)) up
|
||
|
~~~
|
||
|
|
||
|
### A Zipper on Rose Trees - Update
|
||
|
|
||
|
Modifying at the current position:
|
||
|
|
||
|
~~~ {.Haskell}
|
||
|
data Path a = Top | Node [Tree a] (Path a) [Tree a]
|
||
|
data Location a = Loc (Tree a) (Path a)
|
||
|
|
||
|
replace (Loc _ p) t = Loc t p
|
||
|
|
||
|
insertRight (Loc t (Node left up right)) r =
|
||
|
Loc t (Node left up (r:right))
|
||
|
|
||
|
insertLeft (Loc t (Node left up right)) l =
|
||
|
Loc t (Node (l:left) up right)
|
||
|
|
||
|
insertDown (Loc (Section sons) p) t1 =
|
||
|
Loc t1 (Node [] p sons)
|
||
|
~~~
|
||
|
|
||
|
### Crazy Zipper Facts
|
||
|
|
||
|
The zipper structure for a given data structure can be derived
|
||
|
mechanically from its algebraic type signature.
|
||
|
|
||
|
The zipper type signature is the *first derivative* (in the sense you
|
||
|
learned in Calculus) of its parent signature.
|
||
|
|
||
|
Zippers are *one-hole contexts* for their parent type. The *focus
|
||
|
point* is the hole.
|
||
|
|
||
|
Zippers can also be represented by *delimited continuations* of a
|
||
|
traversing process. These can be used with any `Traversable` without
|
||
|
knowing its concrete type at all!
|
||
|
|
||
|
Continuation-capturing traversal functions invert the control
|
||
|
structure in a similar manner to how taking the derivative of the data
|
||
|
type inverts its data structure.
|
||
|
|
||
|
## 2-3 Finger Trees
|
||
|
|
||
|
### Not this time
|
||
|
|
||
|
I lied. There's no time to talk about these. Sorry!
|
||
|
|
||
|
But you should be prepared to read this paper about them now:
|
||
|
|
||
|
http://www.cs.ox.ac.uk/people/ralf.hinze/publications/FingerTrees.pdf
|
||
|
|
||
|
## Hash Array Mapped Tries
|
||
|
|
||
|
### Downsides To This Stuff
|
||
|
|
||
|
Way back at the beginning, I mentioned some downsides:
|
||
|
|
||
|
* Extra overhead for references vs. in-place storage
|
||
|
* Locality of data can suffer
|
||
|
|
||
|
These are constant-factor issues, but with the advances in the cache
|
||
|
structure of machines, they are an increasingly *large* constant
|
||
|
factor.
|
||
|
|
||
|
Bad memory access patterns can make memory fetches take *orders of
|
||
|
magnitude* longer.
|
||
|
|
||
|
The solution is to use *bigger* chunks of data; do more copying and
|
||
|
less pointer chasing.
|
||
|
|
||
|
### Hash Array Mapped Trie
|
||
|
|
||
|
Key features:
|
||
|
|
||
|
* Instead of a branching factor of 2, use a factor of 32 or so.
|
||
|
* Each node (aside from leaf nodes) contains up to 32 elements.
|
||
|
* $n$ bits of the key index the array at the next level
|
||
|
* Bit population count is used to represent sparse arrays
|
||
|
|
||
|
~~~ {.Haskell}
|
||
|
type Key = Word
|
||
|
type Bitmap = Word
|
||
|
data HAMT a = Empty
|
||
|
| BitmapIndexed !Bitmap !(Vector (HAMT a))
|
||
|
| Leaf !Key a
|
||
|
| Full !(Vector (HAMT a))
|
||
|
~~~
|
||
|
|
||
|
### Hash Array Mapped Trie - More Explanation
|
||
|
|
||
|
A *trie* is a tree with a branching factor $k$, where $k$ is the size
|
||
|
of alphabet of key elements.
|
||
|
|
||
|
A trie for caseless English words would have branching factor 26.
|
||
|
|
||
|
The first symbol of the key determines the index to the first tree
|
||
|
level; second symbol to the next, etc.
|
||
|
|
||
|
HAMT divides a key (vector index or hashed key) into bit fields:
|
||
|
|
||
|
* a branching factor of 32 uses 5-bit fields
|
||
|
* A 32-bit key divides into 6 fields, with 2 bits remaining
|
||
|
|
||
|
HAMT stores children sparsely: Bitmap determines child presence,
|
||
|
vector stores only present children.
|
||
|
|
||
|
Clever bit-level math determines child vector index from bitmap.
|
||
|
|
||
|
### Hash Array Mapped Trie - Lookup
|
||
|
|
||
|
~~~ {.Haskell}
|
||
|
maskIdx b m = popCount (b .&. (m - 1))
|
||
|
mask k s = shiftL 1 (subkey k s)
|
||
|
|
||
|
subkeyMask = 1 `shiftL` bitsPerSubkey - 1
|
||
|
subkey k s = fromIntegral $ shiftR k s .&. subkeyMask
|
||
|
|
||
|
lookup :: Key -> HAMT a -> Maybe a
|
||
|
lookup k t = lookup' k 0 t
|
||
|
lookup' k s t
|
||
|
= case t of
|
||
|
Empty -> Nothing
|
||
|
Leaf kx x
|
||
|
| k == kx -> Just x
|
||
|
| otherwise -> Nothing
|
||
|
BitmapIndexed b v ->
|
||
|
let m = mask k s in
|
||
|
if b .&. m == 0
|
||
|
then Nothing
|
||
|
else lookup' k (s+subkeyWidth) (v ! maskIdx b m)
|
||
|
Full v -> lookup' k (s+subkeyWidth) (v ! subkey k s)
|
||
|
~~~
|
||
|
|
||
|
# Conclusion
|
||
|
|
||
|
### Functional Data Structure Fundamentals
|
||
|
|
||
|
Functional data structures provide:
|
||
|
|
||
|
* Safety via Immutability
|
||
|
* Efficient Persistence
|
||
|
* Clear & Simple Implementation
|
||
|
|
||
|
Many are based around some core ideas:
|
||
|
|
||
|
* Numerical Representations
|
||
|
* Lazy Amortization
|
||
|
* Non-Uniform Type Structure
|
||
|
|
||
|
The field is still developing!
|
||
|
|
||
|
Studying Functional Data Structures will teach you a lot about
|
||
|
functional languages.
|