3

EDIT #2: This entire question and exploration were based on my missing the fundamental notion of zippers; that they represent a perspective in a datastructure from the point of view of a particular node. So a zipper is - at all times - a pair of the current node and what the rest of the tree looks like from the perspective of that node. I was originally trying to generate a whole new structure from the zipper, while the zipper itself was all I needed, all along. I'll leave this all up for posterity, in the hope that somebody else is helped by it (or so it serves as a warning to any successors!).

Original question:

I'm trying to get my head around using zippers to manipulate trees. The specific problem is that I need to generate at runtime routes between two nodes matching arbitrary criteria in an arbitrary tree.

I thought I could use the path function to get a route to a location by calling path on the current location. But the path returned seems to omit the last step(s) required to get there.

For example:

(def test-zip (vector-zip [0 [1] [2 [3 [4] 5 [6 [7] [8]]]]]))
(-> test-zip 
    down right right down 
    right down right right
    node)

gives 5, but

(-> test-zip 
    down right right down 
    right down right right
    path)

gives

[[0 [1] [2 [3 [4] 5 [6 [7] [8]]]]] [2 [3 [4] 5 [6 [7] [8]]]] [3 [4] 5 [6 [7] [8]]]]

which isn't the same location (it's missing the effect of the last three steps, down right right).

It looks like the path function only gets you to the parent location in the tree, ignoring any siblings between you and the actual location.

Am I missing the point of the path function? I'd assumed that given a tree and a path, applying the path to the tree would bring you to the original location of the path, not partly there.

UPDATE: I've used the following function definition to compile a path of nodes from a start location to an end location:

(defn lca-path [start-loc end-loc]
  (let [sczip (z/root start-loc)
        start-path (z/path start-loc)
        start-node (z/node start-loc)
        end-path (z/path end-loc)
        end-node (z/node end-loc)
        lca-path (filter (set start-path) end-path)
        lca-node [(last lca-path)]
        lca-to-start (conj (vec (drop (count lca-path) start-path)) start-node)
        lca-to-end (conj (vec (drop (count lca-path) end-path)) end-node)
        ]

    (concat (reverse lca-to-start) lca-node lca-to-end))
  )

Pretty heavily influenced by the chat with @Mark Fisher, thanks!

Oliver Mooney
  • 155
  • 1
  • 7

1 Answers1

4

I think you're right, it's the parent it paths to, and this is confirmed looking at this site, path returns "all subtrees from the root of the tree down to just above the current loc".

For your structure, the tree is:

   A
 / | \
0  o  B
   | / \
   1 2  C __
       /|\  \
      3 o 5  o
        |   /|\
        4  6 o o
             | |
             7 8

So the 3 nodes marked A, B, C are the parent nodes what lead to the parent of 5, which are:

A: [0 [1] [2 [3 [4] 5 [6 [7] [8]]]]]
B: [2 [3 [4] 5 [6 [7] [8]]]]
C: [3 [4] 5 [6 [7] [8]]]

which is the result you're getting.

EDIT: I created this function to search between two values in a vector and return the path between them. Does this help? I'm not an expert on zippers so this is learning for me too.

(defn between [v s e]
  (let [zv (zip/vector-zip v)
        find-loc (fn [zv l]
                   (loop [loc zv]
                     (if (= l (zip/node loc))
                       loc 
                       (if (zip/end? loc)
                           nil
                           (recur (zip/next loc))))))
        sv (zip/vector-zip (-> (find-loc zv s) zip/up zip/node))
        e-in-sv (find-loc sv e)
        path-s-e (when (some? e-in-sv)
                   (zip/path e-in-sv))]
    path-s-e))

It finds the path between the parent of the start value and parent of end value, and you can use it like:

(def sov [0 [1] [2 [3 [4] 5 [6 [7] [8]]]]])
(between sov 2 6)
=> [[2 [3 [4] 5 [6 [7] [8]]]] [3 [4] 5 [6 [7] [8]]] [6 [7] [8]]]

I'm walking the tree to find the parent of the starting value, and then create a new zipper from its parent so that the path is limited from start point. The helper function "find-loc" returns a zipper for the entry you're looking for, e.g. "5" within a vector. sv is then the parent vector containing the start value.

Where there is no path it returns nil, e.g. between 1 and 4 you'd have to traverse up 2 elements.

This was in response to your comments of having to find the value in the final node (e.g. 5 in [3 [4] 5 [6 [7] [8]]]), which i don't think you need to do, but without the real example of what you're doing is difficult to comment on.

BTW, there may be much better ways of traverse/finding values in the vector than the custom function, but as I say zippers are pretty new for me too.

Mark Fisher
  • 9,838
  • 3
  • 32
  • 38
  • So once you get to the parent subtree, you have to search for the required node manually in the immediate descendants of the last sub-tree? Given that the immediately descending nodes are `3`. `[4]`, `5` and `[6 [7] [8]]` that's not too wide a search in this case. But is it not strange that the same path is returned for all four locations? Is there some other idiomatic way of capturing the route to a location in a zipper I'm missing? – Oliver Mooney May 08 '15 at 13:39
  • I'm kind of sharing the journey with you on this one. I've updated my answer to include a finder over your vector between 2 points. Does this fit what you're trying to achieve? – Mark Fisher May 08 '15 at 15:21
  • Also I think it's this way because you're using simple vector-zips rather than some custom zipper over your data. – Mark Fisher May 08 '15 at 15:48
  • Doesn't this duplicate the functionality of the `path` function? Once you find the start location and end location, and call path on both of those, you can do `(filter (set start-path) end-path)` to get the lowest common ancestor of both points. I also do `lca-to-start-parent (drop (count lca-loc) start-path)` to get the path from the lowest common ancestor to the start point, and `(drop (count lca-loc) end-path)` for the other leg to the end point. I like the self-contained implementation you have, and how you could use identical? to navigate to the parent location... – Oliver Mooney May 08 '15 at 15:48
  • Just saw your next comment - I'm actually using vectors as a simplified example to get to grips with zippers, I'm using a custom zipper for my actual data but wanted to rule it out as a complicating factor – Oliver Mooney May 08 '15 at 15:53
  • possibly yes! I was trying to get a better understanding of your problem myself and thought i'd share. there's a good section in Clojure Programming on pathing between 2 points in a maze using a custom zipper using path (http://www.clojurebook.com/). I think the nature of your question of there being the same path to a same number of elements is more a property of using vector-zippers rather than it being a general issue with zippers.... and this comment crossed with your comment :) – Mark Fisher May 08 '15 at 15:54
  • Probably worth noting too that we're both assuming unique values for nodes (true in my case) – Oliver Mooney May 08 '15 at 15:55