0

From an Elasticsearch aggregation query in Clojure, I get a :buckets vector of maps like the following, where :key is a numeric identifier and :doc_count is the number of occurrences.

:buckets [{:key 14768496, :doc_count 464} {:key 14761312, :doc_count 440} {:key 14764921, :doc_count 412}]

Given a key such as 14768496, I would like to retrieve the corresponding doc_count, here 464.

Andreas Guther

3 Answers

3

I provided some feedback on the OP's own answer but figured it was worth providing as an answer in its own right:

user=> (def buckets [{:key 14768496, :doc_count 464} {:key 14761312, :doc_count 440} {:key 14764921, :doc_count 412}])
#'user/buckets
user=> (def accounts (into {} (map (juxt :key :doc_count)) buckets))
#'user/accounts

This uses the transducer-producing arity of map as the "xform" argument to into so it avoids creating any intermediate lazy sequences.

You could also do (into {} (map (juxt :key :doc_count) buckets)) which will produce a lazy sequence of vector pairs (of the key and the document count), and then "pour" it into an empty hash map.
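As a minimal sketch of the two forms side by side (the names accounts-xf and accounts-lazy are made up for illustration, and 99999999 is a hypothetical key that is not in the data), both build the same map, and the lookup the question asks for is then a plain get:

```clojure
(def buckets
  [{:key 14768496, :doc_count 464}
   {:key 14761312, :doc_count 440}
   {:key 14764921, :doc_count 412}])

;; Transducer form: pairs are added to the map as they are produced.
(def accounts-xf (into {} (map (juxt :key :doc_count)) buckets))

;; Lazy-sequence form: a lazy seq of pairs is built, then poured into {}.
(def accounts-lazy (into {} (map (juxt :key :doc_count) buckets)))

(= accounts-xf accounts-lazy)   ;; => true
(get accounts-xf 14768496)      ;; => 464
(get accounts-xf 99999999 0)    ;; => 0 (hypothetical missing key, default supplied)
```

For a handful of buckets the difference between the two forms is unlikely to matter; the transducer form mainly pays off on larger inputs.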

juxt takes one or more functions and returns a new function that applies each of them to its argument and collects the results, in order, into a vector:

user=> ((juxt inc dec) 42)
[43 41]
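Applied to the data in the question, the same idea can be sketched with keywords standing in for the functions, since keywords act as lookup functions on maps (the name bucket is only for illustration):

```clojure
(def bucket {:key 14768496, :doc_count 464})

;; Each keyword looks itself up in the map, so juxt yields a [key count] pair.
((juxt :key :doc_count) bucket)              ;; => [14768496 464]

;; A two-element vector is treated as a map entry when added to a map.
(into {} [((juxt :key :doc_count) bucket)])  ;; => {14768496 464}
```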
Sean Corfield
  • 1
    We could really use a canonical question for "If you want to look something up by key in a list of maps, you should instead arrange to have a single map". – amalloy Mar 01 '22 at 00:40
  • Makes sense. As it looks to me, it is not possible to improve the question after it was posted. – Andreas Guther Mar 08 '22 at 00:54
0

While crafting the question I came across the following solution which I now want to share since it took me a while to find the right approach.

(def accounts (apply hash-map
                      (mapcat
                        #(vector (% :key) (% :doc_count))
                        buckets)))

This produces the following map:

{14768496 464, 14761312 440, 14764921 412}

Now the retrieval is straightforward:

(println (accounts 14768496))
Andreas Guther
  • 1
    #(vector (% :key) (% :doc_count)) can be replaced with (juxt :key :doc_count). In addition, instead of (apply hash-map (mapcat ...)) you could do (into {} (map ...)). So accounts could be (into {} (map (juxt :key :doc_count) buckets)). – Sean Corfield Mar 01 '22 at 00:15
0

A different way of creating your accounts map:

user> (def buckets [{:key 14768496, :doc_count 464} {:key 14761312, :doc_count 440} {:key 14764921, :doc_count 412}])
#'user/buckets
user> (def accounts (zipmap (map :key buckets) (map :doc_count buckets)))
#'user/accounts

Alternatively, search the vector directly without building a map first:

user> (defn find-doc-count [k buckets] (some #(when (= (:key %) k) (:doc_count %)) buckets))
#'user/find-doc-count
user> (find-doc-count 14768496 buckets)
464
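A short sketch of how this direct search behaves, assuming the same buckets as above (99999999 is a hypothetical key that is not in the data): some stops at the first matching bucket and returns nil when nothing matches.

```clojure
(def buckets
  [{:key 14768496, :doc_count 464}
   {:key 14761312, :doc_count 440}
   {:key 14764921, :doc_count 412}])

;; Linear scan: returns the doc_count of the first bucket whose :key equals k.
(defn find-doc-count [k buckets]
  (some #(when (= (:key %) k) (:doc_count %)) buckets))

(find-doc-count 14761312 buckets)   ;; => 440
(find-doc-count 99999999 buckets)   ;; => nil (no bucket has this key)
```

Each call walks the vector, so this probably suits one-off lookups; for repeated lookups, building the map once is likely the better fit.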
dorab