Can I transduce / educe over multiple collections without concatting them

Question

Below is a bare-bones version of what I'm doing:

(eduction (map inc) (concat [1 2] [3 4]))
; -> (2 3 4 5)

Is there a way to get the same eduction, without having to pay the cost of concat, which creates an intermediate lazy seq?

The following is perhaps already a bit less wasteful, since we have just a vector instead of the lazy seq, but I wonder whether even that can be avoided.

(eduction (comp cat (map inc)) [[1 2] [3 4]])

There is almost zero cost to the lazy seq solution using `concat`, so I'm not sure why you don't just do that? — Alan Thompson, Mar 02 '18 at 02:13
@AlanThompson `(not= almost-zero zero)` There are cases where you *need* to squeeze the last bit of performance out of your code. I'm in such a situation, where this code is in the hot path. Also, I'm simply curious. And although transducers are in my opinion mainly about DRY and decomplecting code, another aspect of them is superior performance. — Evgeniy Berezovsky, Mar 02 '18 at 03:17
I tend to agree with @AlanThompson here. If you find a more performant solution, could you please edit your post with https://github.com/hugoduncan/criterium timings comparing the two ? — marco.m, Mar 02 '18 at 06:55

score 4 · Accepted Answer · answered Mar 02 '18 at 09:53

It may be simplest to process your collections separately and combine the results. There is, in fact, an easy reducers-based solution that does exactly that under the covers.

The clojure.core.reducers namespace has cat, a combining function for fold, that you can repurpose to construct a reducible concatenation of your vectors.

(require '[clojure.core.reducers :as r])

(eduction (map inc) (r/cat [1 2] [3 4]))
;; => (2 3 4 5)

This avoids the lazy sequence used in concat. If you have more than two vectors, you can concatenate them all with (reduce r/cat [] colls) or similar.
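
As a minimal sketch of the multi-vector case (the name `colls`, the name `combined`, and the sample data are made up for illustration):

```clojure
(require '[clojure.core.reducers :as r])

;; Hypothetical input: any number of vectors.
(def colls [[1 2] [3 4] [5 6]])

;; r/cat returns its right argument when the left one is empty,
;; so [] is a safe seed for the reduction.
(def combined (reduce r/cat [] colls))

;; Transduce over the reducible concatenation.
(into [] (map inc) combined)
;; => [2 3 4 5 6 7]
```

Note that r/cat calls count on both of its arguments, so it is best suited to counted collections such as vectors.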

This approach did speed up some of the experiments I did, though not your particular example.
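
For anyone who wants to repeat such a comparison, here is a rough sketch of a timing harness. The input sizes, iteration counts, and function names are arbitrary, hypothetical choices, and plain `time` gives only a crude picture; a dedicated benchmarking library such as criterium should give more trustworthy numbers.

```clojure
(require '[clojure.core.reducers :as r])

;; Hypothetical inputs; the sizes are arbitrary.
(def a (vec (range 100000)))
(def b (vec (range 100000 200000)))

;; Three ways to sum the incremented elements of both vectors.
(defn sum-concat [] (transduce (map inc) + 0 (concat a b)))
(defn sum-rcat   [] (transduce (map inc) + 0 (r/cat a b)))
(defn sum-cat    [] (transduce (comp cat (map inc)) + 0 [a b]))

(defn crude-time
  "Warms up f with 20 calls, then prints the elapsed time of 100 further calls."
  [label f]
  (dotimes [_ 20] (f))
  (println label)
  (time (dotimes [_ 100] (f))))

(crude-time "concat" sum-concat)
(crude-time "r/cat" sum-rcat)
(crude-time "cat transducer" sum-cat)
```

All three functions compute the same sum, so any difference in the printed times comes from how the two vectors are combined.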

glts

This goes to show that chunked (lazy) seqs are often fast enough, and the performance ‘improvement’ afforded by reducibles may not be there at all. Measure! – glts Mar 02 '18 at 10:11
Thanks glts. I also timed this against small and large input collections, with the result that performance-wise it does not matter. Still interesting to know this function is there in core... – Evgeniy Berezovsky Mar 04 '18 at 22:04

Caleb Macdonald Black · Answer 2 · 2018-12-09T22:49:26.597

score 0

You can also do this without reducers, using just the built-in cat transducer:

(eduction (comp cat (map inc)) [[1 2] [3 4]])
;; => (2 3 4 5)
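
The same transducer handles any number of inner collections, and it composes with into or transduce as well as eduction. A small sketch (the name `colls` and the sample data are made up):

```clojure
;; cat lives in clojure.core, so no extra require is needed.
(def colls [[1 2] [3 4] [5 6]])

;; Collect the incremented elements into a vector.
(into [] (comp cat (map inc)) colls)
;; => [2 3 4 5 6 7]

;; Or reduce them directly without building a result collection.
(transduce (comp cat (map inc)) + 0 colls)
;; => 27
```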

edited Dec 09 '18 at 22:49

answered Dec 09 '18 at 07:47

If you read my question to the end, you'll find your suggestion in it - verbatim. – Evgeniy Berezovsky Dec 09 '18 at 21:43