In many systems, `head . reverse` requires space proportional to the size of the list, whereas `last` requires constant space.

Are there systems that perform such a transformation? Similarly for `reverse . take n . reverse`?

Edit: I would like to extend my question: I am not after a concrete transformation; rather, I am after any optimization to this end.

false
  • Not sure what you're after. A rewrite rule could do it, if that is sufficiently close to what you want. – Daniel Fischer Mar 17 '13 at 23:34
  • @DanielFischer: I am rather interested in a general method to resolve this. Think of the second example. – false Mar 17 '13 at 23:36
  • if this still interests you, the answer depends on how many consumers the list has; if `last` is the only one, it should run in constant space and O(n) time; but if some other consumer holds a reference to this list, it will come into existence whole when `last` enumerates over it to its last cell. Thus O(n) space and time. Similarly for the `takeLast` shown in Daniel Wagner's answer. -- Or we can change the *actual implementation* of lists, as self-balancing trees with index used as key, with obvious consequences. Clojure uses even cleverer trees with high branching factor (32?). – Will Ness Nov 12 '13 at 21:30
  • @WillNess: Indeed I assumed that there is no other consumer for the list. Seems there is no satisfactory way out of this space leak. – false Nov 12 '13 at 21:36
  • why, no, if there isn't any other consumer then there's no leak. (?) – Will Ness Nov 12 '13 at 21:44
  • @WillNess: During the computation, space requirements are O(n) whereas with `last` they are O(1) – false Nov 12 '13 at 21:49
  • maybe I misunderstand you (which computation do you mean?); but even computing `takeLast k xs` should take O(1) space (with optimizations turned on of course -O2). The consumer of *its* result will determine the next part's size requirement. e.g. `last (takeLast 5 xs)` is O(1) space overall. (again, if this is the only statement in the program concerning `xs`, i.e. there are no other consumers which hold on to some other part in it). -- clarification: `takeLast 5 xs` is not a computation; it is a definition. Only `main` describes the overall computation – Will Ness Nov 12 '13 at 21:57
  • @WillNess: `takeLast k xs` requires at best O(k) space. – false Nov 12 '13 at 22:00
  • let us [continue this discussion in chat](http://chat.stackoverflow.com/rooms/41055/discussion-between-will-ness-and-false) – Will Ness Nov 12 '13 at 22:00

3 Answers

You can transform reverse . take n . reverse by treating your list as a particularly obtuse lazy natural number: empty lists are zero, and conses are succ. For lazy naturals encoded as lists, subtraction is drop:

type LazyNat a = [a]

lengthLazy :: [a] -> LazyNat a
lengthLazy = id

dropLazy :: LazyNat a -> [b] -> [b]
dropLazy [] xs = xs
dropLazy (_:n) (_:xs) = dropLazy n xs
dropLazy _ _ = []

-- like Prelude.subtract, this is flip (-)
subtractLazy :: Int -> LazyNat a -> LazyNat a
subtractLazy = drop

Now we can easily implement the "take last n" function:

takeLast n xs = dropLazy (subtractLazy n (lengthLazy xs)) xs

...and you'll be pleased to know that only n conses need to be in memory at any given time. In particular, takeLast 1 (or indeed takeLast N for any literal N) can run in constant memory. You can verify this by comparing what happens when you run takeLast 5 [1..] with what happens when you run (reverse . take 5 . reverse) [1..] in ghci.

Of course, I've tried to use very suggestive names above, but in a real implementation you might inline all the nonsense above:

takeLast n xs = go xs (drop n xs) where
    go lastn  []    = lastn
    go (_:xs) (_:n) = go xs n
    go _      _     = []
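
As a quick sanity check, here is a minimal sketch that restates the inlined definition with an explicit type signature and compares it with the two-reversal formulation on a few small inputs. The names `takeLastSpec` and `agreesOnSamples` are hypothetical helpers introduced only for illustration; they are not part of the answer above.

```haskell
-- Same algorithm as the inlined version, with a signature and
-- non-shadowing variable names.
takeLast :: Int -> [a] -> [a]
takeLast n xs = go xs (drop n xs)
  where
    -- The second argument runs n cells ahead of the first; when it
    -- reaches the end, the first argument is the last n elements.
    go lastn  []     = lastn
    go (_:ys) (_:zs) = go ys zs
    go _      _      = []  -- unreachable: the second list is a suffix of the first

-- Reference formulation from the question (hypothetical name).
takeLastSpec :: Int -> [a] -> [a]
takeLastSpec n = reverse . take n . reverse

-- Hypothetical helper: both versions agree on a handful of small cases,
-- including n <= 0 and n larger than the list.
agreesOnSamples :: Bool
agreesOnSamples =
  and [ takeLast n xs == takeLastSpec n xs
      | n  <- [-1 .. 6]
      , xs <- [[], [1], [1 .. 5 :: Int]]
      ]
```

Loading this in ghci and evaluating `agreesOnSamples` exercises only finite lists; it says nothing about space usage, which is better observed by running both versions on a long list.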
Daniel Wagner

You can write a simple rewrite rule for this.

http://www.haskell.org/haskellwiki/Playing_by_the_rules

Fusion rules may catch it, too, depending on how reverse is encoded.
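
For illustration, a rewrite rule for the second example from the question might look roughly like the sketch below. It assumes a `takeLast` defined as in Daniel Wagner's answer; the rule name and the caller `lastThree` are hypothetical. Rules only fire when compiling with optimization, GHC does not verify that the two sides are equivalent, and whether this particular rule fires can depend on how `reverse` and `take` are inlined, so treat it as a sketch rather than a guaranteed optimization.

```haskell
-- Single-pass "take the last n elements", as in Daniel Wagner's answer.
takeLast :: Int -> [a] -> [a]
takeLast n xs = go xs (drop n xs)
  where
    go lastn  []     = lastn
    go (_:ys) (_:zs) = go ys zs
    go _      _      = []

-- Ask GHC to replace the two-reversal idiom with takeLast.
-- The rule name is arbitrary; GHC may warn that the rule might not
-- fire if 'reverse' is inlined first.
{-# RULES
"reverse/take/reverse" forall n xs.
    reverse (take n (reverse xs)) = takeLast n xs
  #-}

-- Hypothetical caller written in the original style; with -O the rule
-- above may rewrite its body to 'takeLast 3 xs'.
lastThree :: [a] -> [a]
lastThree xs = reverse (take 3 (reverse xs))
```

The result of `lastThree` is the same whether or not the rule fires; only the space behaviour differs. Compiling with `-ddump-rule-firings` is one way to see whether GHC applied it.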

Don Stewart

If we compare drop and last:

>>> last [1..10^5]
100000
(0.01 secs, 10496736 bytes)
>>> last [1..10^6]
1000000
(0.05 secs, 81968856 bytes)
>>> last [1..10^7]
10000000
(0.32 secs, 802137928 bytes)


>>> drop (10^5-1) [1..10^5]
[100000]
(0.01 secs, 10483312 bytes)
>>> drop (10^6-1) [1..10^6]
[1000000]
(0.05 secs, 82525384 bytes)
>>> drop (10^7-1) [1..10^7]
[10000000]
(0.32 secs, 802142096 bytes)

We obtain similar performance in space and time. I must admit that I cheated a little, because here we don't need to calculate the length of the list. Anyway, I believe it shouldn't be an issue in space. Your reverse . take n . reverse could then be expressed using drop and length.
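
Read literally, that suggestion could be sketched as follows. The name `takeLastViaLength` is hypothetical; the answer only describes the idea in prose.

```haskell
-- Literal reading of "drop and length" (hypothetical name).
-- Note that xs is used twice: it must stay alive while 'length'
-- traverses it so that 'drop' can walk it again afterwards.
takeLastViaLength :: Int -> [a] -> [a]
takeLastViaLength n xs = drop (length xs - n) xs
```

It returns the same results as `reverse . take n . reverse` on finite lists, but because `xs` is shared between `length` and `drop`, the whole list can end up in memory at once. The measurements above avoid this only because the length is known in advance.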


As a side note, I've tested other workarounds and the results are bad.

takeLastN = foldl' (.) id . flip replicate tail 

lastN = foldl' (.) last . flip replicate init
zurgl
  • Calculating the length will effectively require space proportional to the length, since the list can only then be used by drop. Right? – false Mar 18 '13 at 02:07
  • Not as much as reverse, because when you calculate the length of the list you only need to accumulate the result (add one), so only one "slot" of memory is required during the whole process. With reverse, by contrast, the amount of memory keeps growing, because as we traverse the list we construct a new list of the same length. – zurgl Mar 18 '13 at 02:12
  • Anyway, since you speak more generally about a "system": I read somewhere that with a special encoding of the list data type we can include the length of the list by construction, so that when you ask for the length, the answer is returned in constant time and space. The list carries this extra information. But I guess it relies on dependent types, and I'm not sure whether we can do it in Haskell as is. – zurgl Mar 18 '13 at 02:20
  • But the list gets expanded from `[1..10^7]` to something O(10^7) – false Mar 18 '13 at 02:20
  • I do agree, which is why I said "It shouldn't be an issue in space"; but in time, there is an overhead for sure :/ – zurgl Mar 18 '13 at 02:23
  • I downvoted. The reason `reverse` is bad is because it forces the whole list into memory at once, and your proposal to run `length` and then `drop` does the same thing. – Daniel Wagner Mar 18 '13 at 04:46