
Supposing I had:

(def a-map {:foo "bar" :biz {:baz "qux"}})

How would I find the path of keys to a given value "qux" such that

(get-in a-map <the resulting path>) 

would return "qux"?

In other words, a function that takes a-map and "qux" and returns [:biz :baz].

I would then be able to use the returned path like this:

 (get-in a-map [:biz :baz])

and get "qux".

The paths I need are going to be far more nested than this simple example.

I want to find the path to a given value in HTML that has been parsed into an array map using Hickory, without having to mentally navigate down through dozens of nested keys and values. I'm open to other strategies.

THX1137

2 Answers


You can use a zipper for that, for example:

user> (require '[clojure.zip :as z])
nil

user> 
(loop [curr (z/zipper coll? seq nil a-map)]
  (cond (z/end? curr) nil
        (-> curr z/node (= "qux")) (->> curr
                                        z/path
                                        (filter map-entry?)
                                        (mapv first))
        :else (recur (z/next curr))))
;;=> [:biz :baz]

Or the same thing in a more declarative style:

(some->> a-map
         (z/zipper coll? seq nil)
         (iterate z/next)
         (take-while (complement z/end?))
         (filter #(= (z/node %) "qux"))
         first
         z/path
         (filter map-entry?)
         (mapv first))
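
Either form can be wrapped into a reusable function. The following is a minimal sketch; the name `zip-path` is made up for illustration. It assumes the data consists of nested maps only: any node equal to the target matches (keys included), and vector indices along the way are not recorded, so for data that contains vectors the result would not be usable with `get-in`.

```clojure
(require '[clojure.zip :as z])

(def a-map {:foo "bar" :biz {:baz "qux"}})

(defn zip-path
  "Hypothetical helper: returns a vector of map keys leading to the first
  node equal to `target`, or nil when no node matches."
  [data target]
  (some->> (z/zipper coll? seq nil data)    ; every collection is a branch
           (iterate z/next)                 ; depth-first walk
           (take-while (complement z/end?))
           (filter #(= (z/node %) target))
           first                            ; nil here short-circuits some->>
           z/path                           ; ancestor nodes of the match
           (filter map-entry?)              ; keep only the [k v] entries
           (mapv first)))                   ; and take their keys

(zip-path a-map "qux")
;;=> [:biz :baz]

(zip-path a-map "missing")
;;=> nil
```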

Update

You can also use the classic recursive approach:

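(defn get-path
  "Returns the seq of keys leading to `endpoint` in the nested map `data`,
  or nil if `endpoint` is not found."
  [endpoint data]
  (cond (= endpoint data) []                  ; found: empty path from here
        (map? data) (some (fn [[k v]]
                            (when-let [p (get-path endpoint v)]
                              (cons k p)))    ; prepend the current key
                          data)))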

user> (get-path "qux" a-map)
;;=> (:biz :baz)
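
Hickory output nests child nodes inside `:content` vectors, so a path into it also needs vector indices. The following is a minimal sketch of the same recursive idea extended to vectors. The name `find-path`, the use of a predicate instead of a fixed value, and the hand-written `doc` (shaped like Hickory output, not produced by Hickory) are all assumptions made for illustration.

```clojure
(defn find-path
  "Hypothetical helper: returns a vector of keys and indices leading to the
  first value in `data` for which (pred value) is truthy, or nil if none."
  [pred data]
  (letfn [(step [path node]
            (cond
              (pred node)    path
              (map? node)    (some (fn [[k v]] (step (conj path k) v)) node)
              ;; only vectors are descended by index, so the resulting
              ;; path stays valid for get-in
              (vector? node) (some identity
                                   (map-indexed
                                    (fn [i v] (step (conj path i) v))
                                    node))
              :else          nil))]
    (step [] data)))

;; Hand-written data resembling Hickory's map representation.
(def doc
  {:type :element, :tag :div, :attrs nil,
   :content [{:type :element, :tag :p, :attrs nil,
              :content ["qux"]}]})

(find-path #(= % "qux") doc)
;;=> [:content 0 :content 0]

(get-in doc (find-path #(= % "qux") doc))
;;=> "qux"
```

Passing a predicate rather than a value makes it possible to search for a whole node as well, for example `#(= (:tag %) :p)`.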
leetwinski
  • 1
    Your classic recursive approach takes quadratic time to construct the answer as a vector once the `endpoint` has been found. You can construct it as a list in linear time. Probably insignificant. – Thumbnail Apr 25 '20 at 17:09
  • 2
    @Thumbnail , indeed. I've done this consciously, for the sake of readability. (to avoid one more level of indirection for list->vector conversion) – leetwinski Apr 26 '20 at 08:19
  • 2
    I thought you might have. But there is no requirement for the *path of keys* to be a vector. And the relevant standard functions, `get-in` and `assoc-in`, accept any sequence of keys. I'll concede that the structure is hardly likely to be very deep, and the depth certainly can't exceed the limit for recursion. – Thumbnail Apr 26 '20 at 10:01
  • 2
    wow. all these years i was *absolutely sure* they only accept vectors. It should obviously be `(cons k p)` then! updated. – leetwinski Apr 26 '20 at 10:18
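
The point made in these comments, that `get-in` and `assoc-in` accept any sequence of keys and not only a vector, can be checked at the REPL. A short illustration using the map from the question:

```clojure
(def a-map {:foo "bar" :biz {:baz "qux"}})

;; A quoted list of keys works with get-in.
(get-in a-map '(:biz :baz))
;;=> "qux"

;; A seq built with cons works with assoc-in.
(assoc-in a-map (cons :biz [:baz]) "new")
;;=> {:foo "bar", :biz {:baz "new"}}
```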

There are two ways to solve this with the Tupelo library. The first uses the function walk-with-parents-readonly: when you find the node you want, save all of its parent nodes, which can then be processed to extract the information you want:

(ns tst.demo.core
  (:use demo.core tupelo.core tupelo.test)
  (:require [tupelo.forest :as tf]))

(dotest
  (let [result (atom nil)
        data   {:foo "bar" :biz {:baz "qux"}}]
    (walk-with-parents-readonly data
      {:enter (fn [parents item]
                (when (= item "qux")
                  (reset! result parents)))})
    (is= @result
      [{:foo "bar", :biz {:baz "qux"}}
       [:biz {:baz "qux"}]
       {:type :map-val, :value {:baz "qux"}}
       {:baz "qux"}
       [:baz "qux"]
       {:type :map-val, :value "qux"}]))

You can also use the tupelo.forest library, which is designed for processing HTML and other tree-like structures:

  (tf/with-forest (tf/new-forest)
    (let [hiccup      [:foo
                       [:bar
                        [:baz "qux"]]]
          root-hid    (tf/add-tree-hiccup hiccup)
          path-raw    (only (tf/find-paths root-hid [:** :baz]))
          path-pretty (tf/format-path path-raw) ]
      (is= path-pretty
        [{:tag :foo}
         [{:tag :bar}
          [{:tag :baz, :value "qux"}]]]) )))

Please see also the example for extracting a permalink from the XKCD comic webpage.

Alan Thompson