Infinite self-referencing list

Question

Problem

I'm trying to implement the modified Dragon Curve from AoC Day 16 as an infinite list in Haskell.

The list is composed of True and False. We start with some list s0:

s1 = s0 ++ [False] ++ (map not . reverse) s0
s2 = s1 ++ [False] ++ (map not . reverse) s1
s3 = s2 ++ [False] ++ (map not . reverse) s2

Generally

sn = s(n-1) ++ [False] ++ (map not . reverse) s(n-1)
   = s0 ++ [False] ++ (f s0) ++ [False] ++ (f (s0 ++ [False] ++ (f s0))) ++ ...
      where f = (map not . reverse)
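
As a concrete check of the recurrence, here is a minimal sketch that builds the first few sn for a given seed. The names step, approximations and render are illustrative only and are not part of the original question.

```haskell
-- One step of the recurrence: s ++ [False] ++ (map not . reverse) s.
step :: [Bool] -> [Bool]
step s = s ++ [False] ++ (map not . reverse) s

-- The list [s0, s1, s2, ...] for a given seed.
approximations :: [Bool] -> [[Bool]]
approximations = iterate step

-- Render a list of Bool as a string of 0s and 1s for readability.
render :: [Bool] -> String
render = map (\b -> if b then '1' else '0')
```

For the seed [True], map render (take 3 (approximations [True])) evaluates to ["1","100","1000110"], and each entry is a prefix of the next.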

Attempted Implementation

I can get sn quite easily using the iterate function.

modifiedDragonCurve :: [Bool] -> Int -> [Bool]
modifiedDragonCurve s n = iterate f s !! n
    where f xs = xs ++ [False] ++ (map not . reverse) xs

Here iterate f s is the list [s0, s1, s2, ...] and !! n picks out sn. However, since s(n-1) is a prefix of sn, the limit could be built as a single infinite list, but I cannot figure out how to approach it. I think I need something along the lines of

modifiedDragonCurve :: [Bool] -> [Bool]
modifiedDragonCurve s = s ++ [False] ++ (map not . reverse) listSoFar

But I cannot figure out how to refer to the already generated list (listSoFar).

Any suggestions would be greatly appreciated.

@freestyle: I want to create a function `[Bool] -> [Bool]` that (given initial list) generates an infinite list. Essentially, I want to implement the sequence `sn = s(n-1) ++ [0] ++ (map not . reverse) s(n-1)` — Bart Platak, Dec 16 '16 at 11:49
So, limit (sn) where n -> ∞? sn, you already implemented, it's `(iterate f s)!!n` — freestyle, Dec 16 '16 at 12:03

Daniel Wagner · Accepted Answer · 2016-12-17T00:01:28.950

I played with this myself while solving the AoC problem, too. I found a remarkable solution that does not require reverse, hence is more memory-friendly and speedy than the other solutions listed here. It's also beautiful! The dragon curve itself is a nice short two-liner:

merge (x:xs) ys = x:merge ys xs
dragon = merge (cycle [False, True]) dragon
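
In this definition the even positions come from cycle [False, True] and the odd positions come from dragon itself. A minimal sketch for inspecting a prefix, with type signatures and a render helper added here for illustration (neither is in the original answer):

```haskell
-- Interleave two lists, starting with the first.
-- Only the cons case is handled; that is enough here because
-- both arguments are infinite.
merge :: [a] -> [a] -> [a]
merge (x:xs) ys = x : merge ys xs

dragon :: [Bool]
dragon = merge (cycle [False, True]) dragon

-- Render a list of Bool as a string of 0s and 1s for readability.
render :: [Bool] -> String
render = map (\b -> if b then '1' else '0')
```

render (take 15 dragon) evaluates to "001001100011011", which is s4 for the empty seed.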

It can be extended to use a "seed" as the AoC problem demands just by alternating between the seed and the bits of the true dragon curve:

infinite bs = go bs (map not (reverse bs)) dragon where
    go bs bs' (d:ds) = bs ++ [d] ++ go bs' bs ds

(This does call reverse once -- but unlike other solutions, it is called just once on a chunk of data about the size of the input, and not repeatedly on chunks of data about as large as the part of the list you consume.) Some timings to justify my claims (all versions were used to produce 2^25 elements with an empty seed, compiled with ghc -O2, and timed with /usr/bin/time):

freestyle's solution takes 11.64s, ~1.8 GB max resident
David Fletcher's solution takes 10.71s, ~2 GB max resident
luqui's solution takes 9.93s, ~1 GB max resident
my solution takes 8.87s, ~760 MB max resident

The full test program was

main = mapM_ print . take (2^25) . dragon $ []

with dragon replaced by each implementation in turn. A carefully crafted consumer can lower memory usage even further: my best solution to the second-star problem so far runs in 5 MB real residency (i.e. including all the space GHC allocated from the OS for its multiple generations, slack space, and other RTS overhead) and 60 KB GHC-reported residency (i.e. just the space used by not-yet-GC'd objects, regardless of how much space GHC has allocated from the OS).

For raw speed, though, you can't beat an unboxed mutable vector of Bool: a coworker reports that his program using one ran in 0.2s, using about 35 MB of memory to store the complete expanded (but not infinite!) vector.
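
A minimal sketch of that finite, in-place approach. It uses STUArray from the array package as a stand-in for the unboxed mutable vector mentioned above (an assumption; the coworker's actual program is not shown here), and the name expand is hypothetical.

```haskell
import Control.Monad (forM_)
import Data.Array.ST (newArray, readArray, runSTUArray, writeArray)
import Data.Array.Unboxed (UArray, elems)

-- Fill an unboxed array with the first n elements of the curve for a seed.
-- Hypothetical helper, not taken from the thread.
expand :: [Bool] -> Int -> UArray Int Bool
expand seed n = runSTUArray $ do
  -- Every cell starts as False, so the separators need no explicit write.
  arr <- newArray (0, n - 1) False
  -- Copy the seed (truncated to n elements) into the front of the array.
  forM_ (zip [0 .. n - 1] seed) $ \(i, b) -> writeArray arr i b
  -- Lengths of s0, s1, s2, ... that are still shorter than n.
  let starts = takeWhile (< n) (iterate (\c -> 2 * c + 1) (length seed))
  forM_ starts $ \cur ->
    -- Write the negated mirror image of cells [0 .. cur-1] after the
    -- separator at index cur, stopping at the end of the array.
    forM_ [0 .. min (cur - 1) (n - cur - 2)] $ \i -> do
      b <- readArray arr (cur - 1 - i)
      writeArray arr (cur + 1 + i) (not b)
  return arr
```

Each pass only reads cells that earlier passes have already filled, so nothing is reversed or copied into an intermediate list; elems (expand seed n) converts the result back to a list for inspection.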

Wow that's lovely and surprising – luqui Dec 16 '16 at 22:41

David Fletcher · Answer 2 · 2016-12-16T12:35:53.547

Here's one way. We make a list not of s0, s1, etc., but of only the new part added at each step; then we can just concat them together.

dragonCurve :: [Bool]
dragonCurve = concatMap f [0..]
  where
    f n = False : (map not . reverse) (take (2^n-1) dragonCurve)

(This assumes s0 = []. If it can be something else you'll have to modify the length calculation.)
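
One possible way to adjust the length calculation for a non-empty seed, as a sketch: if s0 has length k, then sn has length 2^n * (k + 1) - 1, because each step doubles the length and adds one. The name dragonCurveFrom is hypothetical.

```haskell
-- Self-referential curve for an arbitrary finite seed.
dragonCurveFrom :: [Bool] -> [Bool]
dragonCurveFrom s0 = curve
  where
    k = length s0
    curve = s0 ++ concatMap part [0 :: Int ..]
    -- The new part added when going from sn to s(n+1):
    -- a False separator followed by the negated reverse of sn,
    -- where sn is the prefix of the curve of length 2^n * (k + 1) - 1.
    part n = False : (map not . reverse) (take (2 ^ n * (k + 1) - 1) curve)
```

With an empty seed, k is 0 and the prefix length reduces to the 2^n-1 used above.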

I can't think of a nice way to both be self-referential and not deal with prefix lengths. Here's a non-self-referencing solution, still using the idea of making a list of non-overlapping parts.

dragonCurve' :: [Bool]
dragonCurve' = concat (unfoldr f [])
  where
    f soFar = Just (part, soFar ++ part)
      where
        part = False : (map not . reverse) soFar

edited Dec 16 '16 at 12:35

answered Dec 16 '16 at 12:23

David Fletcher


Thanks, that's exactly what I was looking for. Surprisingly, both of those solutions require _more_ memory than my original iterate. – Bart Platak Dec 16 '16 at 14:07
@BartPlatak, One reason it might take more memory is that `dragonCurve` is now a CAF, so you're remembering the `dragonCurve` list forever rather than streaming and cleaning up after yourself. Though it should be asymptotically the same because `reverse` needs linear memory already, and you're reversing at least half of the list. – luqui Dec 16 '16 at 14:43
Makes sense @luqui. Bearing in mind that I was operating on 35 million values it was still surprisingly quick. I was also using `chunksOf` later on which isn't very efficient either... – Bart Platak Dec 16 '16 at 15:47
@BartPlatak, there's nothing terribly wrong with `chunksOf`. It's not as interesting as the version I put in `Data.Sequence`, but it's around as fast as it can be. – dfeuer Dec 16 '16 at 18:32
@dfeuer: The rest of the problem involves processing non-overlapping pairs generated from the dragon curve. I have found my computation time reduce from 35s to 9s when I replaced `chunksOf` with a pattern match - not too sure why. If you're interested, the diff can be found [here](https://github.com/Norfavrell/adventofcode-16/commit/433e809a2d5002224da0516e60b780a1696f055e). – Bart Platak Dec 16 '16 at 18:40
Oh, for such tiny chunks, I guess you probably save a lot by not allocating the chunks at all. `chunksOf` doesn't really make sense in that context. – dfeuer Dec 16 '16 at 18:43

luqui · Answer 3 · 2016-12-16T16:57:39.880

You totally nerd sniped me with this. It's not a self-referencing list, but I did manage to come up with a "wasteless" solution -- one where we're not dropping or forgetting about anything we've computed.

dragon :: [Bool] -> [Bool]
dragon = \s -> s ++ gen [] s
    where
    inv = map not . reverse
    gen a b =
        let b' = inv b
            r = b' ++ a
        in False:r ++ gen r (False:r)

gen a b takes as input all the data of the current sequence, represented so that the current sequence is inv a ++ b. We then generate the remainder r, output it (preceded by the False separator), and recursively continue generating. I keep a in inverted form because then all I need to do at each step is prepend b' (which does not even examine a), so we never reverse more than we have to.
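
The invariant can be made concrete with a small sketch that records the successive arguments of gen and compares the sequence they represent with the recurrence from the question. The names genStates, represented and reference are hypothetical and exist only for this check.

```haskell
inv :: [Bool] -> [Bool]
inv = map not . reverse

-- The successive (a, b) arguments that gen receives, starting from ([], s).
genStates :: [Bool] -> [([Bool], [Bool])]
genStates s = iterate next ([], s)
  where
    next (a, b) = let r = inv b ++ a in (r, False : r)

-- The sequence a state stands for, per the invariant inv a ++ b.
represented :: ([Bool], [Bool]) -> [Bool]
represented (a, b) = inv a ++ b

-- Reference: [s0, s1, s2, ...] from the question's recurrence.
reference :: [Bool] -> [[Bool]]
reference = iterate (\s -> s ++ [False] ++ inv s)
```

For a finite seed, map represented (take 5 (genStates seed)) == take 5 (reference seed) holds, because inv (inv b ++ a) is inv a ++ b, which is exactly the current sequence.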

Being nerdsniped I investigated quite a number of other data structures, imagining that a linked list is probably not the best fit for this problem, including DList, Data.Sequence (finger trees), free monoid (which ought to be good at being reversed), and a custom tree that cancels reverses. To my surprise, list still performed the best of all of these and I'm still bewildered by that. In case you are curious, here is my code.

edited Dec 16 '16 at 16:57

answered Dec 16 '16 at 16:51

luqui


I'm curious as to why the lists performed best. Did you compile with optimizations (ie. could it be fusion related?). – Alec Dec 16 '16 at 17:45
If you're after performance, I recommend manually fusing the `map not` with `reverse`. `revComp = go [] where go acc [] = acc; go acc (x : xs) = let !x' = not x in go (x' : acc) xs`. A more substantial transformation likely worth trying: represent 64 elements per unpacked word in `data WordList = Cons !Word WordList`. You can then waste a whole bunch of time working out the fastest bit hacks. – dfeuer Dec 16 '16 at 18:38
Thanks @luqui, it's a very neat (and quick!) solution. I'll have to look through your code tomorrow to better comprehend the different data structures and get back to you. – Bart Platak Dec 16 '16 at 18:48
@dfeuer, I wasn't interested in raw performance as much as what kind of data structure/sequence abstraction would perform well. I did compile with optimizations, didn't realize that it could be related to fusion. – luqui Dec 16 '16 at 20:51

freestyle · Answer 4 · 2016-12-16T17:53:14.107

For example:

dragon s0 = s0 ++ concatMap ((False :) . inv) seq
  where
    inv = map not . reverse
    seq = iterate (\s -> s ++ False : inv s) s0

edited Dec 16 '16 at 17:53

answered Dec 16 '16 at 12:39

freestyle


This is not a correct solution: it produces an extra copy of the first component at each recursive level. – luqui Dec 16 '16 at 16:38
@luqui This implementation produces the sequence: `s0 ++ [False] ++ inv s0 ++ [False] ++ inv s1 ++ [False] ++ inv s2 ++...`. It's what we want. So, what is not correct? – freestyle Dec 16 '16 at 17:40
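
The claim in the comment above can be checked mechanically. A small sketch (the name agreesUpTo is hypothetical) that compares this dragon against the sn from the question:

```haskell
import Data.List (isPrefixOf)

inv :: [Bool] -> [Bool]
inv = map not . reverse

-- The definition from the answer above, with the local list renamed
-- from seq to approximations to avoid shadowing Prelude's seq.
dragon :: [Bool] -> [Bool]
dragon s0 = s0 ++ concatMap ((False :) . inv) approximations
  where
    approximations = iterate (\s -> s ++ False : inv s) s0

-- True when each of s0 .. sn is a prefix of dragon s0.
agreesUpTo :: Int -> [Bool] -> Bool
agreesUpTo n s0 = all (`isPrefixOf` dragon s0) (take (n + 1) (iterate step s0))
  where
    step s = s ++ [False] ++ inv s
```

For example, agreesUpTo 6 [True, False] evaluates to True, which matches the expansion s0 ++ [False] ++ inv s0 ++ [False] ++ inv s1 ++ ... given in the comment.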
