12

Given the following...

(def inTree
 '((1 2)
   (1 2 3)
   (1 2 4 5 9)
   (1 2 4 10 15)
   (1 2 4 20 25)))

How would you transform it to this trie?

(def outTrie
 '(1
    (2 ()
       (3 ())
       (4 (5
            (9 ()))
          (10
            (15 ()))
          (20
            (25 ()))))))
Bill the Lizard
  • 398,270
  • 210
  • 566
  • 880
Johnny
  • 1,144
  • 2
  • 11
  • 23

4 Answers4

16

Here's a cleaned up solution. This fixes a bug Brian's add-to-trie method since it's currently dependent upon you inserting the seqs in increasing-length order. It also allows querying the trie by prefix, which is a common use case.

Note the memory usage here is higher since it stores the values in the leaf nodes of the trie so you can perform searches.

(defn add-to-trie [trie x]
  (assoc-in trie x (merge (get-in trie x) {:val x :terminal true})))

(defn in-trie? [trie x]
  "Returns true if the value x exists in the specified trie."
  (:terminal (get-in trie x) false))

(defn prefix-matches [trie prefix]
  "Returns a list of matches with the prefix specified in the trie specified."
  (keep :val (tree-seq map? vals (get-in trie prefix))))

(defn build-trie [coll]
  "Builds a trie over the values in the specified seq coll."
  (reduce add-to-trie {} coll))
NielsK
  • 6,886
  • 1
  • 24
  • 46
Greg Fodor
  • 421
  • 4
  • 4
  • 1
    So Brian's version would be fine I guess if you always used the same number of keys? – Johnny Apr 14 '10 at 04:20
  • 1
    Your definition of `prefix-matches` uses a function `map-filter`, but there is no such function in the standard library. I tried to reverse-engineer what it does, but it isn’t obvious. Can you post its definition? – Rory O'Kane Jan 17 '15 at 04:34
  • `map-filter` is similar to `keep`, which is in the core lib. – NielsK Apr 25 '16 at 15:48
  • I've applied a minimal fix for the bug you mention to [@Brian's answer](http://stackoverflow.com/a/1453478/1562315). – Thumbnail May 14 '17 at 12:28
  • Is there a reason why you've chosen to include both a `:val` key and a `:terminal` key, when only terminal nodes have `:val` entries? – Eric Auld Sep 27 '21 at 23:37
10

Lists are very clumsy here, not to mention inefficient. In Clojure it's more idiomatic to use vectors and hash-maps and sets when appropriate. Using hash-maps:

(def in-tree
 '((1 2)
   (1 2 3)
   (1 2 4 5 9)
   (1 2 4 10 15)
   (1 2 4 20 25)))

(defn add-to-trie [trie x]
  (assoc-in trie `(~@x :terminal) true))

(defn in-trie? [trie x]
  (get-in trie `(~@x :terminal)))

If you wanted it to print sorted you could use sorted-maps instead, but you'd have to write your own version of assoc-in that used sorted maps the whole way down. In any case:

user> (def trie (reduce add-to-trie {} in-tree))
#'user/trie
user> trie
{1 {2 {4 {20 {25 {:terminal true}}, 10 {15 {:terminal true}}, 5 {9 {:terminal true}}}, 3 {:terminal true}, :terminal true}}}
user> (in-trie? trie '(1 2))
true
user> (in-trie? trie '(1 2 4))
nil
user> (in-trie? trie '(1 2 4 20 25))
true
Thumbnail
  • 13,293
  • 2
  • 29
  • 37
Brian Carper
  • 71,150
  • 28
  • 166
  • 168
  • 1
    Great answer and highlight that mine was actually incorrectly ignoring the substring issue. I would suggest a slightly different in-tri?: (defn in-trie? [trie x] (:terminal (get-in trie x) false)) user=> (in-trie? trie '(1 2 4)) false Makes it a true predicate and avoids splicing syntax. – Timothy Pratley Sep 21 '09 at 09:05
  • Maybe `::terminal`, in case we're trie-ing sequences with `:terminal` in them? – Thumbnail May 14 '17 at 11:30
  • I've fixed the bug [@GregFooter](http://stackoverflow.com/a/2511115/1562315) found. Feel free to reverse the edit. Unlike @TimothyPratley, I found your use of unquote-splicing helpful, as it show it as an operation on data - not trapped in a macro body. – Thumbnail May 14 '17 at 12:25
1

As a general approach, here's what I would do:

  • Write a few functions to create a trie and to insert new elements into a trie.
  • Create a new trie.
  • Iterate through the input list and insert each element into the trie.

This problem lends itself very well to a recursive implementation. I would aim for that if possible.

Greg Hewgill
  • 951,095
  • 183
  • 1,149
  • 1,285
1

I'm sure there is a prettier way (there was! see Brian's answer it is better):

(defn find-in-trie
  "Finds a sub trie that matches an item, eg:
  user=> (find-in-trie '(1 (2) (3 (2))) 3)
  (3 (2))"
  [tr item]
  (first (for [ll (rest tr) :when (= (first ll) item)] ll)))


(defn add-to-trie
  "Returns a new trie, the result of adding se to tr, eg:
  user=> (add-to-trie nil '(1 2))
  (1 (2))"
  [tr se]
  (cond
    (empty? se) tr
    (empty? tr) (add-to-trie (list (first se)) (rest se))
    :else (if-let [st (find-in-trie tr (first se))]
            (cons (first tr)
                  (cons (add-to-trie st (rest se))
                        (filter (partial not= st) (rest tr))))
            (cons (first tr)
                  (cons (add-to-trie (list (first se)) (rest se))
                        (rest tr))))))

(def in '((1 2)
          (1 2 3)
          (1 2 4 5 9)
          (1 2 4 10 15)
          (1 2 4 20 25)))

(reduce add-to-trie '(nil) in)

-> (nil (1 (2 (4 (20 (25)) (10 (15)) (5 (9))) (3))))

Note that I've chosen to use nil as the root node and have not bothered keeping empty lists to signify no children. Actually doing it this way is not correct as it does not preserve substring identity.

Timothy Pratley
  • 10,586
  • 3
  • 34
  • 63