Clojure 1.1 introduced "chunked" sequences:
This can provide greater efficiency ... Consumption of chunked-seqs as
normal seqs should be completely transparent. However, note that some
sequence processing will occur up to 32 elements at a time. This could
matter to you if you are relying on full laziness to preclude the
generation of any non-consumed results. [Section 2.3 of "Changes to Clojure in Version 1.1"]
In your example, (range) seems to be producing a seq that realizes one element at a time, while (range 999) produces a chunked seq. map will consume a chunked seq a chunk at a time, producing another chunked seq. So when take asks for the first element of map's result, the function passed to map is called 32 times, on the values 0 through 31.
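You can observe this by counting calls with an atom (a minimal sketch; count-calls is just an illustrative helper, and the count of 32 assumes the current chunk size, which is an implementation detail):

```clojure
;; Count how many times the mapped function is called when we
;; ask for only the first element of the result.
(defn count-calls [coll]
  (let [calls (atom 0)]
    (first (map (fn [x] (swap! calls inc) x) coll))
    @calls))

(count-calls (range 999)) ;; => 32, the whole first chunk is realized
(count-calls (range))     ;; => 1, only one element is realized
```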
I believe it is wisest to write code that will still work for any seq-producing function, even if that function produces a chunked seq with an arbitrarily large chunk size.
I also do not know whether, if one writes a seq-producing function that is not chunked, one can rely on current and future versions of library functions like map and filter not to convert the seq into a chunked one.
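When one-element-at-a-time realization is genuinely required, a common idiom is to rebuild the seq element by element (a sketch; unchunk is not a core function, just a conventional name for this wrapper):

```clojure
(defn unchunk
  "Wrap s so that consumers see a plain lazy seq realized one
  element at a time, even if s itself is chunked."
  [s]
  (lazy-seq
    (when-let [s (seq s)]
      (cons (first s) (unchunk (rest s))))))

;; Only one call to the mapped function is needed for (first ...),
;; even though (range 999) is chunked:
(defn calls-through-unchunk []
  (let [calls (atom 0)]
    (first (map (fn [x] (swap! calls inc) x) (unchunk (range 999))))
    @calls))

(calls-through-unchunk) ;; => 1
```

Note that this trades away the efficiency chunking provides, so it is worth wrapping only the seqs where over-realization actually matters.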
But why the difference? What are the implementation details such that (range) and (range 999) are different in the sort of seq produced?
- range is implemented in clojure.core. (range), with no arguments, is defined as (iterate inc' 0).
- Ultimately, iterate's functionality is provided by the Iterate class in Iterate.java.
- (range end) is defined, when end is a long, as (clojure.lang.LongRange/create end).
- The LongRange class lives in LongRange.java.
Looking at the two Java files, it can be seen that the LongRange class implements IChunkedSeq and the Iterate class does not. (Exercise left for the reader.)
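You can also confirm the difference from the REPL without reading the Java, using chunked-seq? from clojure.core (class names shown assume a recent Clojure version):

```clojure
;; LongRange implements IChunkedSeq; Iterate does not.
(chunked-seq? (seq (range 999))) ;; => true
(chunked-seq? (seq (range)))     ;; => false

(class (seq (range 999))) ;; clojure.lang.LongRange
(class (seq (range)))     ;; clojure.lang.Iterate
```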
Speculation
- The implementation of clojure.lang.Iterate does not chunk because iterate can be given a function of arbitrary complexity, and the efficiency gained from chunking can easily be overwhelmed by computing more values than needed.
- The implementation of (range) relies on iterate instead of a custom optimized Java class that does chunking because the no-argument (range) case is not believed to be common enough to warrant optimization.