
I'm having a problem stringing some forms together to do some ETL on a result set from a korma function.

I get back from korma sql:

({:id 1 :some_field "asd" :children [{:a 1 :b 2 :c 3} {:a 1 :b 3 :c 4} {:a 2 :b 2 :c 3}] :another_field "qwe"})

I'm looking to filter this result set by getting the "children" where the :a keyword is 1.

My attempt:

;mock of korma result
(def data '({:id 1 :some_field "asd" :children [{:a 1 :b 2 :c 3} {:a 1 :b 3 :c 4} {:a 2 :b 2 :c 3}] :another_field "qwe"}))

(-> data 
    first 
    :children 
    (filter #(= (% :a) 1)))

What I expect here is a vector of the hash maps whose :a is 1, i.e.:

[{:a 1 :b 2 :c 3} {:a 1 :b 3 :c 4}]

However, I'm getting the following error:

IllegalArgumentException Don't know how to create ISeq from: xxx.core$eval3145$fn__3146  clojure.lang.RT.seqFrom (RT.java:505)

From the error I gather it's trying to create a sequence from a function, though I can't connect the dots as to why.

Further, if I separate the filter function entirely by doing the following:

(let [children (-> data first :children)] 
    (filter #(= (% :a) 1) children))

it works. I'm not sure why the first-thread is not applying the filter function, passing in the :children vector as the coll argument.

Any and all help much appreciated.

Thanks

Alan Thompson
OnResolve

3 Answers


You want the thread-last macro:

(->> data first :children (filter #(= (% :a) 1)))

yields

({:a 1, :b 2, :c 3} {:a 1, :b 3, :c 4})

The thread-first macro in your original code is equivalent to writing:

(filter (:children (first data)) #(= (% :a) 1))

This results in an error because filter now receives your anonymous function in the collection position, and a function is not a sequence.
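One way to check where each macro places the threaded value is to expand the forms at the REPL. The sketch below uses clojure.walk/macroexpand-all on quoted forms; `pred` is only a placeholder symbol standing in for the anonymous function, and the var names are made up for illustration.

```clojure
(require '[clojure.walk :as walk])

;; `pred` is a placeholder symbol; nothing is evaluated, the forms are only expanded.
(def thread-first-expansion
  (walk/macroexpand-all '(-> data first :children (filter pred))))

(def thread-last-expansion
  (walk/macroexpand-all '(->> data first :children (filter pred))))

(println thread-first-expansion) ; (filter (:children (first data)) pred)
(println thread-last-expansion)  ; (filter pred (:children (first data)))
```

Only the thread-last expansion matches the argument order filter expects: predicate first, collection last.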

Diego Basch

The thread-first (->) and thread-last (->>) macros are error-prone because it is easy to pick the wrong one (or to mix them up, as happened here). Break the steps down like so:

(ns tstclj.core
  (:use tupelo.core)  ; provides spyx and it->; see https://github.com/cloojure/tupelo/
  (:gen-class))

(def data [ {:id 1 :some_field "asd" 
             :children [ {:a 1 :b 2 :c 3} 
                          {:a 1 :b 3 :c 4}
                          {:a 2 :b 2 :c 3} ] 
             :another_field "qwe"} ] )

(def v1    (first data))
(def v2    (:children v1))
(def v3    (filter #(= (% :a) 1) v2))

(spyx v1)    ; from tupelo.core/spyx
(spyx v2)
(spyx v3)

You will get results like:

v1 => {:children [{:c 3, :b 2, :a 1} {:c 4, :b 3, :a 1} {:c 3, :b 2, :a 2}], :another_field "qwe", :id 1, :some_field "asd"}
v2 => [{:c 3, :b 2, :a 1} {:c 4, :b 3, :a 1} {:c 3, :b 2, :a 2}]
v3 => ({:c 3, :b 2, :a 1} {:c 4, :b 3, :a 1})

which is what you wanted. The problem is that the filter form needs thread-last, not thread-first. The most reliable way to avoid this mistake is to be explicit about where the value goes, using Clojure's as-> threading form or, even better, it-> from the Tupelo library:

(def result (it-> data 
                  (first it)
                  (:children  it)
                  (filter #(= (% :a) 1) it)))

By using thread-first, you accidentally wrote the equivalent of this:

(def result (it-> data 
                  (first it)
                  (:children  it)
                  (filter it #(= (% :a) 1))))

and the error reflects the fact that the function #(= (% :a) 1) cannot be converted into a seq. Sometimes it pays to use a let form and give names to the intermediate results:

(let [result-map        (first data)
      children-vec      (:children  result-map)
      a1-maps           (filter #(= (% :a) 1) children-vec) ]
  (spyx a1-maps))
;;-> a1-maps => ({:c 3, :b 2, :a 1} {:c 4, :b 3, :a 1})

We could also look at either of the two previous solutions and notice that the output of each stage is used as the last argument to the next function in the pipeline. Thus, we could also solve it with thread-last:

(def result3  (->>  data
                    first
                    :children
                    (filter #(= (% :a) 1))))
(spyx result3)
;;-> result3 => ({:c 3, :b 2, :a 1} {:c 4, :b 3, :a 1})

Unless your processing chain is very simple, I find it is just about always clearer to use the it-> form to be explicit about how the intermediate value should be used by each stage of the pipeline.
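For readers who prefer not to add a library, the same explicit style can be sketched with the built-in as-> macro (available since Clojure 1.5). The name `it` is just a local binding chosen here to mirror it->, and `result-as` is an illustrative var name.

```clojure
(def data [{:id 1 :some_field "asd"
            :children [{:a 1 :b 2 :c 3}
                       {:a 1 :b 3 :c 4}
                       {:a 2 :b 2 :c 3}]
            :another_field "qwe"}])

;; as-> binds the running value to the chosen name, so each form
;; states explicitly where that value is used.
(def result-as
  (as-> data it
    (first it)
    (:children it)
    (filter #(= (:a %) 1) it)))

(println result-as) ; ({:a 1, :b 2, :c 3} {:a 1, :b 3, :c 4})
```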

Alan Thompson

I'm not sure why the first-thread is not applying the filter function, passing in the :children vector as the coll argument.

Thread-first does apply filter, but it inserts the :children vector as the first argument rather than as the trailing coll argument.

From clojuredocs.org:

Threads the expr through the forms. Inserts x as the second item in the first form, making a list of it if it is not a list already.

So, in your case the application of filter ends up being:

(filter [...] #(= (% :a) 1))
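The difference in argument position is easy to see with a non-commutative function such as subtraction. This is a small illustrative sketch; the var names are made up for the example.

```clojure
;; Thread-first puts 10 in the first argument slot: (- 10 3)
(def first-result (-> 10 (- 3)))

;; Thread-last puts 10 in the last argument slot: (- 3 10)
(def last-result (->> 10 (- 3)))

(println first-result) ; 7
(println last-result)  ; -7
```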

If you must use thread-first (instead of thread-last), then you can get around this by partially applying filter and its predicate:

(->
  data
  first
  :children
  ((partial filter #(= (:a %) 1)))
  vec)

; [{:a 1, :b 2, :c 3} {:a 1, :b 3, :c 4}]
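Two other ways to get a vector back, sketched here as alternatives rather than as the recommended approach: nest a ->> form inside the -> pipeline for the one step that needs the value last, or use thread-last throughout with filterv, which returns a vector directly. The var names are illustrative.

```clojure
(def data [{:id 1 :some_field "asd"
            :children [{:a 1 :b 2 :c 3}
                       {:a 1 :b 3 :c 4}
                       {:a 2 :b 2 :c 3}]
            :another_field "qwe"}])

;; -> inserts the running value as the first argument of (->> ...),
;; and ->> then threads it into the last position of filter.
(def nested-result
  (-> data
      first
      :children
      (->> (filter #(= (:a %) 1)))
      vec))

;; filterv is an eager variant of filter that returns a vector.
(def filterv-result
  (->> data
       first
       :children
       (filterv #(= (:a %) 1))))

(println nested-result)  ; [{:a 1, :b 2, :c 3} {:a 1, :b 3, :c 4}]
(println filterv-result) ; [{:a 1, :b 2, :c 3} {:a 1, :b 3, :c 4}]
```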
pdoherty926