Agent not getting executed

Question

I have a series of functions (like some-operation in the example), which I send or send-off to agents:

(defn some-operation [agent-state]
  (dosync
   (let [updated (foo agent-state)]  ;; derive new state from old one
     (alter bar whatev updated)      ;; reflect the new state in the world
     (send *agent* some-operation)   ;; "recur"
     updated)))                      ;; value for recur

(send (agent {}) some-operation)

This approach worked while I was developing my app. But after some changes in the codebase, the agents simply stop running after a while ('a while' being a few seconds, or a few thousand "recursive" calls).

Their state is valid in the domain, the agents themselves haven't FAILED, and I am certain that they are not livelocking on their dosync blocks (one can measure contention).

My suspicion is that the JVM/OS is preventing the underlying executor thread from running, for some reason or other. But I don't know how to check whether this assumption is right.

In general, what are some possible reasons why an agent might not get its pending "sends" executed? What can I inspect/measure?

Update - given the following modification for debugging...

(defn some-operation [agent-state]
  (let [f (future
            (dosync
             ...) ;; the whole operation, as per the previous example
            )]
    (Thread/sleep 1000) ;; give the operation some time
    (if (realized? f)
      @f

      ;; not realized: we deem the operation as blocking indefinitely
      (do
        (println :failed)
        (send *agent* some-operation)
        agent-state))))

...the agent still gets stuck, and doesn't even print :failed.

Does the agent or ref have any [validators](http://clojuredocs.org/clojure_core/clojure.core/set-validator!) added to them? — juan.facorro, Jun 24 '13 at 12:19
Yes, some of the refs these functions operate upon have associated validators. — deprecated, Jun 24 '13 at 12:22
If the new state for the ref is rejected by any of its validators, then all `send` or `send-off` calls on agents within the transaction may be discarded. — juan.facorro, Jun 24 '13 at 12:25
It seems likely that this is the case; I will check it out. In any case, the silent failure of this behavior is pretty undesirable... — deprecated, Jun 24 '13 at 12:29
Validation didn't happen to be the source of my problems. Any other ideas? — deprecated, Jun 24 '13 at 13:12
In general you should do as much outside the `dosync` as possible. `(defn s-o [a-s] (let [u (foo a-s)] (send *agent* s-o) (dosync (alter bar whatever u)) u))` For debugging: you might want to bisect the mentioned changes in the code base to identify the culprit. — kotarak, Jun 24 '13 at 14:27
How about a minimal case to reproduce the failure conditions? — John Cromartie, Jun 24 '13 at 17:16
If you `send` inside a transaction but the transaction fails (e.g. due to a validator) then the `send` will be cancelled. — John Cromartie, Jun 24 '13 at 17:29

score 0 · Answer 1 · answered Jun 24 '13 at 21:44

It's worth being aware of how send and dosync interact. Sends made inside a dosync are held until the transaction commits, and are then dispatched exactly once. This prevents messages from being delivered to an agent from a transaction that is later discarded. You could test this by shrinking the scope of the dosync.
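A minimal sketch of that interaction, for experimenting at a REPL. The names `log` and `balance` are hypothetical and not taken from the question; the validator is only there to force a transaction to abort.

```clojure
;; Hypothetical names for illustration: `log` is an agent, `balance` a ref
;; whose validator rejects negative values.
(def log (agent []))
(def balance (ref 0 :validator #(>= % 0)))

;; Committed transaction: the send is held until commit, then dispatched.
(dosync
  (alter balance + 10)
  (send log conj :deposit))

;; Aborted transaction: the validator rejects the new value, dosync throws,
;; and the send queued inside the transaction is dropped.
(try
  (dosync
    (send log conj :withdraw)
    (alter balance - 100))
  (catch Exception e
    (println "transaction aborted:" (.getMessage e))))

;; Wait for any dispatched actions to finish before inspecting the agent.
(await log)
(println "log:" @log "balance:" @balance)
```

If the agent in the question stops after a transaction that threw, the "recursive" send inside that dosync would never have been dispatched, which would look like the agent silently stalling.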

score 0 · Answer 2 · edited May 23 '17 at 11:57

The send pool is limited, so only a certain number of agent actions can run at the same time (see this answer). Could this be the case?
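One way to check this is to look at the executor behind `send` and at the agent's own queue. The sketch below assumes the default executors have not been replaced (for example via `set-agent-send-executor!`), in which case `clojure.lang.Agent/pooledExecutor` is a `java.util.concurrent.ThreadPoolExecutor`. The helper names `pool-stats` and `agent-stats` are hypothetical.

```clojure
(import '(java.util.concurrent ThreadPoolExecutor))

;; Snapshot of a thread pool: how many threads exist, how many are busy,
;; and how many tasks are waiting for a free thread.
(defn pool-stats [^ThreadPoolExecutor pool]
  {:pool-size (.getPoolSize pool)
   :max-size  (.getMaximumPoolSize pool)
   :active    (.getActiveCount pool)
   :queued    (.size (.getQueue pool))})

;; Snapshot of one agent: its error (nil if healthy) and the number of
;; actions still pending on it.
(defn agent-stats [^clojure.lang.Agent a]
  {:error          (agent-error a)
   :queued-actions (.getQueueCount a)})

;; Pool used by `send`; the pool used by `send-off` is
;; clojure.lang.Agent/soloExecutor.
(println (pool-stats clojure.lang.Agent/pooledExecutor))
(println (agent-stats (agent {})))
```

If `:active` stays equal to `:max-size` while `:queued` keeps growing, the send pool is saturated, typically by actions that block. An agent with a non-zero `:queued-actions` and an idle pool points elsewhere.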


answered Jul 23 '13 at 10:42 · Niki Tonsky
