4

Using the following recursive definition of a depth first search a Clojure (JVM) and ClojureScript (tested with both browser connected repl and lumo) REPL produce two different outputs, i.e. order in which nodes are printed is different and Clojure REPL produces a duplicate :f. The ClojureScript order is the behaviour I would expect. Why is this?

Code:

(defn dfs
  ([g v] (dfs g v #{}))
  ([g v seen]
   (println v)
   (let [seen (conj seen v)]
     (for [n (v g)]
       (if-not (contains? seen n)
         (dfs g n seen))))))

(def graph {:a [:b :c :e]
            :b [:d :f]
            :c [:g]})

(dfs graph :a)

Cloure REPL output:

:a
:b
:c
:e
:d
:f
:g
:f
;; => ((() ()) (()) (()))

CLojureScript REPL output:

:a
:b
:d
:f
:c
:g
:e
;; => ((() ()) (()) ())
mac
  • 9,885
  • 4
  • 36
  • 51
  • 2
    In both clojure 1.8 and 1.9-alpha14, I get no `:f` at the end, and my parens are identical to your cljs parens. – Josh Feb 27 '17 at 13:09
  • That is weird, I am using Clojure 1.8.0 and ClojureScript 1.9.229. – mac Feb 27 '17 at 13:47

1 Answers1

6

Clojure's for generates a lazy sequence. The actual evaluation of all recurrent dfs calls is triggered only by your REPL as it needs to print the function's output, i.e. ((() ()) (()) ()). If you evaluate (do (dfs graph :a) nil), you will only get :a printed.

Now, Clojure's lazy sequences are evaluated in chunks of size 32 for efficiency. So, when the REPL (through str function) evaluates the first element lazy sequence the top level for (which should print :b), the other elements of that seq are also evaluated and you get :c and :e printed before the child nodes' sequences are evaluated (which are lazy too).

In contrast, Clojurescript's lazy sequences are not chunked (LazySeq does not implement IChunkedSeq) and are evaluated one-by-one, so when the return value is recursively converted to string, everything is evaluated in depth-first order.

To illustrate this - try (first (for [i (range 300)] (do (println "printing:" i) i))) in the REPL both in Clojure and CLJS - you will get 32 numbers printed in clojure and only one number in CLJS.

If you want better guarantees on the order of evaluation, you can use doseq instead of for or wrap the for in doall.

Hope this helps.

Side note: just as @Josh, I get no :f in the end in Clojure 1.8, and the parens are identical to the cljs output - that's really weird...

I am not sure I am following how you currently want to use the result of your DFS. If you want to use side effects, i. e. print all the nodes to the console, use doseq to make sure that they are traversed:

(defn dfs-eager
  ([g v] (dfs-eager g v #{}))
  ([g v seen]
   (println v)
   (let [seen (conj seen v)]
     (doseq [n (v g)]
       (if-not (contains? seen n)
         (dfs-eager g n seen))))))

This will print all the nodes to the console, depth-first. If you want to get the traversal as a return value, use for but make sure you actually return a meaningful value:

(defn dfs-lazy
  ([g v] (dfs-lazy g v #{}))
  ([g v seen]
   (cons v
         (let [seen (conj seen v)]
           (for [n (v g)]
             (if-not (contains? seen n)
               (dfs-lazy g n seen)))))))

You will get a nested list (:a (:b (:d) (:f)) (:c (:g)) (:e)) - which you can then flatten to get a traversal. You will also get the benefits of laziness.

Aleph Aleph
  • 5,215
  • 2
  • 13
  • 28
  • Thanks for the answer. It seems crazy to me that evaluation order would be affected by the chunking. Also why would the other elements be output to the REPL when I am only printing ```v```. If I remove the ```println``` call nothing is printed. – mac Feb 28 '17 at 15:20
  • The problem is that you are mixing side-effectful code and lazy code. In your case, the order (and the very fact) of evaluation depends not only on chunking, but also on whether your REPL outputs the return value of the function `((() ()) (()) ())` to the console - which is by itself crazy if you think about it. You can either use imperative code if you want to output the result to the console, or functional / lazy code if you want the result as a return value (please see a couple of examples in the edit to my post above). But mixing the two approaches is not proper Clojure. – Aleph Aleph Feb 28 '17 at 19:38