
I'm learning core.async and have written a simple producer/consumer example:

(ns webcrawler.parallel
  (:require [clojure.core.async :as async
             :refer [>! <! >!! <!! go chan buffer close! thread alts! alts!! timeout]]))

(defn consumer
  [in out f]
  (go (loop [request (<! in)]
        (if (nil? request)
          (close! out)
          (do (print f)
            (let [result (f request)]
              (>! out result))
              (recur (<! in)))))))

(defn make-consumer [in f]
  (let [out (chan)]
    (consumer in out f)
    out))

(defn process
  [f s no-of-consumers]
  (let [in (chan (count s))
        consumers (repeatedly no-of-consumers #(make-consumer in f))
        out (async/merge consumers)]
    (map #(>!! in %1) s)
    (close! in)
    (loop [result (<!! out)
           results '()]
      (if (nil? result)
        results
        (recur (<!! out)
               (conj results result))))))

This code works fine when I step through the process function in the debugger that ships with CIDER for Emacs.

(process (partial + 1) '(1 2 3 4) 1)
(5 4 3 2)

However, if I run it by itself (or hit continue in the debugger) I get an empty result.

(process (partial + 1) '(1 2 3 4) 1)
()

My guess is that in the second case the producer, for some reason, doesn't wait for the consumers before exiting, but I'm not sure why. Thanks for the help!

radious

3 Answers


The problem is that map is lazy: your call to it returns a lazy sequence, and the mapped function will not run until something consumes that sequence. Nothing in your code does.
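The laziness is easy to observe in isolation. The following is a minimal sketch, not taken from the question: `calls`, `record!` and `lazy-then-forced` are hypothetical names, and an atom stands in for the channel so that no core.async dependency is needed.

```clojure
;; Side effects are recorded in an atom so we can see whether the
;; mapped function actually ran.
(def calls (atom []))

(defn record!
  "Side-effecting stand-in for (>!! in x): remembers x and returns it."
  [x]
  (swap! calls conj x)
  x)

(defn lazy-then-forced
  "Returns what had been recorded before and after forcing the lazy seq."
  []
  (reset! calls [])
  (let [pending (map record! '(1 2 3 4)) ; builds a lazy seq, runs nothing yet
        before  @calls]
    (dorun pending)                      ; walks the seq, running record!
    {:before before :after @calls}))

(lazy-then-forced)
;; => {:before [], :after [1 2 3 4]}
```

In the question's process function the result of map is simply discarded, so the equivalent of the dorun step never happens and nothing is put on the channel.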

There are 2 solutions:

(1) Use the eager function mapv:

(mapv #(>!! in %1) s)

(2) Use doseq, which is intended for side-effecting operations (such as putting values on a channel):

(doseq [item s]
  (>!! in item))

Both will work and produce output:

(process (partial + 1) [1 2 3 4] 1) => (5 4 3 2)
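For reference, here is roughly how the whole pipeline might look with the doseq fix applied. This is a sketch rather than a definitive rewrite: it assumes core.async is on the classpath, the namespace name `webcrawler.parallel-sketch` is made up, the consumer is folded into make-consumer, and the debug print is dropped.

```clojure
(ns webcrawler.parallel-sketch
  (:require [clojure.core.async :as async
             :refer [>! <! >!! <!! go close! chan]]))

(defn make-consumer
  "Starts a go block that applies f to every value taken from `in`
  and returns the channel that carries the results."
  [in f]
  (let [out (chan)]
    (go (loop [request (<! in)]
          (if (nil? request)
            (close! out)                 ; input exhausted: close our output
            (do (>! out (f request))
                (recur (<! in))))))
    out))

(defn process
  [f s no-of-consumers]
  (let [in        (chan (count s))
        ;; doall so the consumers are started here, not lazily later
        consumers (doall (repeatedly no-of-consumers #(make-consumer in f)))
        out       (async/merge consumers)]
    (doseq [item s]                      ; eager, unlike map
      (>!! in item))
    (close! in)
    ;; Collect results until the merged channel closes.
    (loop [results '()]
      (if-some [result (<!! out)]
        (recur (conj results result))
        results))))
```

With more than one consumer the order of the results is not guaranteed, so callers that care about order would need to sort or tag the values themselves.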

P.S. You have a debug statement in (defn consumer ...)

(print f)

that produces a lot of noise in the output:

<#clojure.core$partial$fn__5561 #object[clojure.core$partial$fn__5561 0x31cced7
"clojure.core$partial$fn__5561@31cced7"]>

That is repeated 5 times back to back. You probably want to avoid that, as printing function "refs" is pretty useless to a human reader.

Also, debug printouts in general should normally use println so you can see where each one begins and ends.

Alan Thompson

My best guess is that this is caused by the lazy behavior of map, and by this line, which carries out side effects:

(map #(>!! in %1) s)

Because you never use the result of map, the mapped function never runs. Change it to use mapv, which is strict, or more correctly, use doseq. Never use map to run side effects. It's meant to lazily transform a sequence, and abusing it leads to behaviour like this.

So why does it work while debugging? My guess is that the debugger forces evaluation as part of its operation, which masks the problem.
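Whether CIDER's debugger does exactly this is an assumption, but the general effect is easy to reproduce: anything that prints or otherwise inspects a lazy sequence forces it. A small sketch using only clojure.core (the var names are illustrative):

```clojure
;; A lazy seq produced by map stays unrealized until something walks it.
(def pending (map inc '(1 2 3)))

(def realized-before? (realized? pending)) ; false: nothing has consumed it

;; Printing walks the whole seq, much as a REPL or a tool that
;; displays intermediate values would.
(def printed (pr-str pending))             ; "(2 3 4)"

(def realized-after? (realized? pending))  ; true: printing forced it
```

A tool that shows the value of each form as you step through it would have the same forcing effect on the discarded map call.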

Carcigenicate

As the docstring says, map returns a lazy sequence. I think the best way to force it is dorun. Here is an example from ClojureDocs:

;;map a function which makes database calls over a vector of values 
user=> (map #(db/insert :person {:name %}) ["Fred" "Ethel" "Lucy" "Ricardo"])
JdbcSQLException The object is already closed [90007-170]  org.h2.message.DbException.getJdbcSQLException (DbException.java:329)

;;database connection was closed before we got a chance to do our transactions
;;lets wrap it in dorun
user=> (dorun (map #(db/insert :person {:name %}) ["Fred" "Ethel" "Lucy" "Ricardo"]))
DEBUG :db insert into person values name = 'Fred'
DEBUG :db insert into person values name = 'Ethel'
DEBUG :db insert into person values name = 'Lucy'
DEBUG :db insert into person values name = 'Ricardo'
nil
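To put dorun next to its relatives, here is a small comparison that does not need a database. It is only a sketch: `log` and `insert!` are hypothetical stand-ins for the db/insert call above. dorun and run! discard the results and return nil, while doall forces the sequence and returns it.

```clojure
(def log (atom []))

(defn insert!
  "Stand-in for a side-effecting call such as a database insert."
  [name]
  (swap! log conj name)
  name)

(def names ["Fred" "Ethel" "Lucy" "Ricardo"])

;; Forces the lazy seq for its side effects and throws the values away.
(def dorun-result (dorun (map insert! names)))  ; nil

;; Forces the lazy seq and keeps the realized values.
(def doall-result (doall (map insert! names)))  ; ("Fred" "Ethel" "Lucy" "Ricardo")

;; Skips map entirely: calls insert! on each element, returns nil.
(def run-result (run! insert! names))           ; nil
```

When the return values are not needed, run! or doseq states the intent more directly than wrapping map in dorun.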