I believe I've figured out a way. The number system I suggested in the question turns out not to be the right one; it doesn't support O(log n)
pop
(or at least does not do so simply). We can patch this up by switching from 0/1/2 redundant binary to 1/2/3 redundant binary.
-- Note the lazy field in the Two constructor.
data Stack a
= Empty
| One !(SmallArray a) !(Stack a)
| Two !(SmallArray a) !(SmallArray a) (Stack a)
| Three !(SmallArray a) !(SmallArray a) !(SmallArray a) !(Stack a)
push :: a -> Stack a -> Stack a
push = pushA . pure
pushA :: SmallArray a -> Stack a -> Stack a
pushA sa Empty = One sa Empty
pushA sa1 (One sa2 more) = Two sa1 sa2 more
pushA sa1 (Two sa2 sa3 more) = Three sa1 sa2 sa3 more
pushA sa1 (Three sa2 sa3 sa4 more) = Two sa1 sa2 (pushA (sa3 <> sa4) more)
pop :: Stack a -> Maybe (a, Stack a)
pop stk = do
ConsA sa stk' <- pure $ popA stk
hd <- indexSmallArrayM sa 0
Just (hd, stk')
data ViewA a = EmptyA | ConsA !(SmallArray a) (Stack a)
popA :: Stack a -> ViewA a
popA Empty = EmptyA
popA (Three sa1 sa2 sa3 more) = ConsA sa1 (Two sa2 sa3 more)
popA (Two sa1 sa2 more) = ConsA sa1 (One sa2 more)
popA (One sa more) = ConsA sa $
case popA more of
EmptyA -> Empty
ConsA sa1 more' -> Two sa2 sa3 more'
where
len' = sizeofSmallArray sa1 `quot` 2
sa2 = cloneSmallArray sa1 0 len'
sa3 = cloneSmallArray sa1 len' len'
The first important step in proving this has the desired amortized bounds is to choose a debit invariant[*]. This had me stuck for quite a while, but I think I've got it.
Debit invariant: We allow the lazy Stack
in a Two
node as many debits as there are elements stored in that and all earlier Two
nodes.
Theorem
push
and pop
run in O(log n)
amortized time.
Proof sketch
Push
We consider each of the cases in turn.
Empty
is always trivial.
One
: We increase the debit allowance below.
Two
: We reduce the debit allowance of nodes below by 1 unit. We pay O(log n)
to discharge the excess debits.
Three
: This is the tricky case for push
. We have some number of Three
nodes followed by something else. For each Three
node, we suspend s
array-doubling work. We pay for that using the additional debit allowance we gain from the elements in the new Two
node. When we reach the end of the Three
chain, we need to do something a bit funny. We may need the full debit allowance below, so we use debit passing to spread the debits for the final array append across all the earlier nodes.
At the end, we have either Empty
, One
, or Two
. If we have Empty
or One
, we're done. If we have Two
, then changing that to Three
reduces the debit allowance below. But we also gain debit allowance below, from all the Three
s that have changed to Two
s! Our net loss debit allowance is just 1, so we're golden.
Pop
We again proceed by cases.
Empty
is trivial.
Three
: we increase the debit allowance below.
Two
: We reduce the debit allowance on certain nodes by 1 unit; pay O(log n) to discharge the excess debits.
One
: This is the hard case. We have some number of One
nodes followed by something else. For each One
, we perform a split. We place debits to pay for those, discharging the ones at the root. At the end, we have a situation similar to that for push
: the tricky case is ending in Two
, where we use the fact that all the new Two
s pay for the loss of the final Two
.
Compactness
One might worry that enough thunks could accumulate in the structure to negate the compactness of the array-based representation. Fortunately, this is not the case. A thunk can appear only on the Stack
in a Two
node. But any operation on that node will turn it into a One
or a Three
, forcing the Stack
. So thunks can never accumulate in chains, and we never have more than one thunk per node.
[*] Okasaki, C. (1998). Purely Functional Data Structures. Cambridge: Cambridge University Press. doi:10.1017/CBO9780511530104, or read the relevant parts of his thesis online.