When processing each element in a seq I normally use first and rest. However, these will cause a lazy-seq to lose its "laziness" by calling seq on the argument. My solution has been to use (first (take 1 coll)) and (drop 1 coll) in their place when working with lazy-seqs. While I think drop 1 is just fine, I don't particularly like having to call both first and take to get the first element.

Is there a more idiomatic way to do this?

robertjlooby

1 Answer

The docstrings for first and rest say that these functions call seq on their arguments to convey the idea that you don't have to call seq yourself when passing in a seqable collection which is not in itself a seq, like, say, a vector or set. For example,

(first [1 2 3])
;= 1

would not work if first didn't call seq on its argument; you'd have to say

(first (seq [1 2 3]))

instead, which would be inconvenient.

Both take and drop also call seq on their arguments; otherwise you couldn't call them on vectors and the like, as explained above. In fact this is true of all standard seq functions -- those which do not call seq directly are built upon lower-level components which do.

In no way does this impair the laziness of lazy seqs. The forcing / realization which happens as a result of a first / rest call is the smallest amount possible to obtain the requested result. (How much that is depends on the type of the argument; if it is not in fact lazy, there is no extra realization involved in the first call; if it is partly lazy -- that is, chunked -- there will be some extra realization (up to 32 initial elements will be computed at once); if it's fully lazy, only the first element will be computed.)
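To make the difference between fully lazy and chunked seqs concrete, here is a minimal sketch that counts how many elements get computed when first is called on a mapped seq. The helper names lazy-nums and calls-for-first are made up for this illustration and are not part of clojure.core.

```clojure
(defn lazy-nums
  "Hand-rolled, unchunked lazy seq of n, n+1, n+2, ... (illustrative helper)."
  [n]
  (lazy-seq (cons n (lazy-nums (inc n)))))

(defn calls-for-first
  "Maps a counting function over coll, takes `first` of the result,
  and returns how many times the function was invoked."
  [coll]
  (let [calls       (atom 0)
        counted-inc (fn [x] (swap! calls inc) (inc x))]
    (first (map counted-inc coll))
    @calls))

;; Fully lazy input: only the first element is computed.
(calls-for-first (lazy-nums 0))
;= 1

;; Chunked input: a whole chunk is computed at once.
(calls-for-first (range 100))
;= 32
```

The count of 32 comes from map processing a chunked input one chunk at a time; first itself still asks only for a single element.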

Clearly first, when passed a lazy seq, must force the realization of its first element -- that's the whole point. rest is somewhat lazy in that it doesn't force the realization of the "rest" part of the seq (that's in contrast to next, which is basically equivalent to (seq (rest ...))). The fact that it does force the first element to be realized so that it can skip over it immediately is a conscious design choice which avoids unnecessary layering of lazy seq objects and holding the head of the original seq; you could say something like (lazy-seq (rest xs)) to defer even this initial realization, at the cost of holding on to xs until the lazy seq wrapper is realized.
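The behaviour of first, rest, next and the (lazy-seq (rest xs)) wrapper described above can be observed with a seq that logs each element as it is realized. This is only a sketch: tracked-seq and realized-by are hypothetical helpers written for this example, and the input is deliberately unchunked so that chunking does not blur the picture.

```clojure
(defn tracked-seq
  "Unchunked lazy seq of n, n+1, ... that records each element in the
  `log` atom at the moment it is realized (illustrative helper)."
  [log n]
  (lazy-seq
    (swap! log conj n)
    (cons n (tracked-seq log (inc n)))))

(defn realized-by
  "Applies f to a fresh tracked seq and returns the elements realized so far."
  [f]
  (let [log (atom [])]
    (f (tracked-seq log 0))
    @log))

(realized-by first)
;= [0]

;; rest realizes the first element in order to skip it, but not the tail.
(realized-by rest)
;= [0]

;; next behaves like (seq (rest ...)), so it realizes one more element.
(realized-by next)
;= [0 1]

;; Wrapping in lazy-seq defers even the initial realization...
(realized-by (fn [xs] (lazy-seq (rest xs))))
;= []

;; ...until the wrapper itself is consumed.
(realized-by (fn [xs] (first (lazy-seq (rest xs)))))
;= [0 1]
```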

Michał Marczyk
  • Thanks for that. I think it is the difference between the chunked vs. fully lazy that was throwing me off. In particular when I was first working with lazy-seq I think whatever I was working on must have caused problems when chunked so I just stuck with it after that. – robertjlooby Aug 06 '13 at 03:02
  • Yes, chunking can lead to unexpected results if one forgets about it. `take` uses both `first` and `rest` in its implementation, while `drop` is built on `rest`, so they don't help. Unchunking is possible, but only in the sense that one can cause transformations layered on top of chunked seqs to be evaluated one element at a time; the underlying chunked seq will always realize its chunks in full. (The way to do it is to wrap the chunked seq in an unchunking seq, perhaps produced with `(reify clojure.lang.ISeq ...)` with the original seq in a closure.) – Michał Marczyk Aug 06 '13 at 03:10
  • Are you sure `take`/`drop` also actually call `seq`? Just trying it out in the repl and calling `class` with `(range)`, `(take 1 (range))`, `(drop 1 (range))` all give `clojure.lang.LazySeq`. Calling with `(rest (range))`, `(seq (range))` gives `clojure.lang.ChunkedCons`. – robertjlooby Aug 06 '13 at 03:12
  • `take` and `drop` wrap their bodies in `lazy-seq`, so they hold on to the entire input collection until you actually realize this wrapper seq. Once you do realize it, the resulting seq is produced by code involving an explicit call to `seq` on the input collection (for reasons explained in the answer), as well as calls to `first` and `rest`. In fact in the previous comment I was going to mention the calls to `seq` in the second sentence, but somehow I forgot and now I can't edit... Oh well, I did in the answer. See `(source take)` and `(source drop)` for details. – Michał Marczyk Aug 06 '13 at 03:15
  • So actually `(drop 1 xs)` is pretty much equivalent to `(lazy-seq (rest xs))`, with the same associated costs. See this SO question from April and my answer over there for an example of possible undesirable results of using this sort of wrapper (of course it's just something to keep in mind, not a reason to avoid using such wrappers when they are genuinely useful): [Clojure head retention](http://stackoverflow.com/questions/15994316/clojure-head-retention). – Michał Marczyk Aug 06 '13 at 03:24
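
Two points from the comments above can be sketched in code. First, unchunking: the comment suggests an unchunking seq built with `(reify clojure.lang.ISeq ...)`; the sketch below swaps in a simpler `lazy-seq`/`cons` wrapper, which is an assumption of this example rather than the approach named in the comment. Second, the claim that `(drop 1 xs)` defers realization just like `(lazy-seq (rest xs))`. The names `unchunk`, `calls-for-first`, `tracked-seq` and `realized-by` are hypothetical helpers, not part of `clojure.core`.

```clojure
(defn unchunk
  "Wraps s in a lazy seq that hands out one element at a time, so that
  transformations layered on top are not evaluated chunk by chunk.
  The underlying chunked seq still realizes its own chunks in full."
  [s]
  (lazy-seq
    (when-let [s (seq s)]
      (cons (first s) (unchunk (rest s))))))

(defn calls-for-first
  "Returns how many times a mapped function runs when taking `first`."
  [coll]
  (let [calls (atom 0)]
    (first (map (fn [x] (swap! calls inc) x) coll))
    @calls))

(calls-for-first (range 100))
;= 32
(calls-for-first (unchunk (range 100)))
;= 1

(defn tracked-seq
  "Unchunked lazy seq that logs each element when it is realized."
  [log n]
  (lazy-seq
    (swap! log conj n)
    (cons n (tracked-seq log (inc n)))))

(defn realized-by
  "Applies f to a fresh tracked seq and returns the elements realized so far."
  [f]
  (let [log (atom [])]
    (f (tracked-seq log 0))
    @log))

;; Neither wrapper touches the input until it is consumed.
(realized-by (fn [xs] (drop 1 xs)))
;= []
(realized-by (fn [xs] (lazy-seq (rest xs))))
;= []

;; Consuming the wrapper realizes the skipped element and the next one.
(realized-by (fn [xs] (first (drop 1 xs))))
;= [0 1]
```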