As part of a data visualization app that I'm working on, I've encountered something that's either a bizarre bug or me fundamentally not understanding something.

My application has code that takes data structures representing colorscales and transforms them into functions that take a number and return a hash of color RGB values.

Both gradient and range colorscales are implemented:

{:type :gradient
 :scale [{:bound 0 :r 0 :g 0 :b 0}
         {:bound 1 :r 255 :g 0 :b 0}
         {:bound 2 :r 0 :g 255 :b 0}]}

{:type :range
 :scale [{:bound [[< 0]] :r 250 :g 250 :b 250}
         {:bound [[>= 0] [< 1]] :r 0 :g 0 :b 0}
         {:bound [[>= 1] [< 2]] :r 255 :g 0 :b 0}
         {:bound [[>= 2]] :r 0 :g 255 :b 0}]}

There are functions that turn these into functions, the usage of which resembles the following:

((create-colorscale-fn **GRADIENT-MAP**) 1.5) => {:r 128 :g 128 :b 0}
((create-colorscale-fn **RANGE-MAP**) 1.5) => {:r 255 :g 0 :b 0}

There are functions that convert between the two as well, but this one is the one relevant to my post:

(defn- gradient-colorscale-to-range
  [in]
  {:pre [(verify-gradient-colorscale in)]
   :post [(verify-range-colorscale %)]}
  {:type :range
   :scale (into []
        (concat
         (let [{:keys [bound]} (-> in :scale first)
               {:keys [r g b]} {:r 250 :g 250 :b 250}]
           [{:bound [[< bound]] :r r :g g :b b}])
         (mapv (fn [[a {:keys [r g b]}]] {:bound a :r r :g g :b b})
               (partition 2 (interleave
                     (map (partial apply vector)
                      (partition 2
                             (interleave
                              (map #(vector >= (:bound %)) (-> in :scale))
                              (map #(vector < (:bound %)) (-> in :scale rest)))))
                     (-> in :scale))))
         (let [{:keys [bound r g b]} (-> in :scale last)]
           [{:bound [[>= bound]] :r r :g g :b b}])))})

Part of the "verify-range-colorscale" function tests the following condition regarding the inequality operators:

(every? #{< <= > >=} (map first (mapcat #(-> % :bound) (:scale in))))
 ;;Each bound must consist of either <= < >= >

Here's where my problem lies:

For some reason, most of the time, when I run this function, it doesn't give me any problems, and the test for the appropriate inequality operators runs as it should:

(def gradient
    {:type :gradient
    :scale [{:bound 0 :r 0 :g 0 :b 0}
             {:bound 1 :r 255 :g 0 :b 0}
             {:bound 2 :r 0 :g 255 :b 0}]})

 (#{< <= > >=} (get-in (gradient-colorscale-to-range gradient) [:scale 0 :bound 0 0]))
     => #object[clojure.core$_LT_ 0x550b46f1 "clojure.core$_LT_@550b46f1"]

However, the colorscales are set inside an atom, the contents of which are found inside a global variable. There are editors that I've developed that copy part of the state of the colorscale into another atom, which is then edited using a graphical editor. When I convert the gradient to range inside the atom, associate the contents of the atom into the global atom, and THEN check the equality of the operators, for some bizarre reason the test fails.

 (#{< <= > >=} (get-in (gradient-colorscale-to-range gradient) [:scale 0 :bound 0 0]))
     => nil

When I check to see WHY it's failing, it appears that the hash code of the less than function changes at some point during the atomic updates.

(mapv #(format "%x" (.hashCode %)) [< (get-in @xmrg-cache [[0 0] :colorscale :scale 0 :bound 0 0])])
   -> ["550b46f1" "74688dde"]

And since set inclusion apparently tests functions based on their hashcode, this causes my "verify-range-colorscale" test to fail.

So the question is, why is the hash code of my inequality function changing during atomic updates? It's a function defined in clojure.core, but it seems like a copy of it is being made at some point?


Edit in response to Piotrek:

The data structure is stored in a global atom in the namespace "inav".

When loading the hashcode of <:

 (format "%x" (.hashCode <)) => "425b1f8f"

When changing a colorscale stored in the display configuration atom from the repl using the conversion function:

 (swap! xmrg-cache update-in [[0 0] :colorscale] gradient-colorscale-to-range)
 (format "%x" (.hashCode (get-in @xmrg-cache [[0 0] :colorscale :scale 0 :bound 0 0]))) => "425b1f8f"

There's a graphical colorscale editor that uses a series of watches to edit temporary copies before updating the active configuration. It's launched by clicking on a colorscale preview image:

  (.addMouseListener colorscale-img-lbl
     (proxy [MouseAdapter] []
        (mouseClicked [me]
           (let [cscale-atom (atom (get-in @xmrg-cache [(find-pane-xy e) :colorscale]))]
              (add-watch cscale-atom :aoeu
                 (fn [k v os ns]
                     (swap! xmrg-cache assoc-in [(find-pane-xy parent-e) :colorscale] ns)
                     (redrawing-function)))
              (launch-colorscale-editor cscale-atom other-irrelevant-args)))))

Then launch-colorscale-editor has a bunch of options, but the relevant parts are the conversion combobox and apply button:

(defn- launch-colorscale-editor [cscale-atom & other-irrelevant-args]
  (let [tmp-cscale-atom (atom @cscale-atom)
        convert-cb (doto (JComboBox. (to-array ["Gradient" "Range"]))
                      (.setSelectedItem ({:range "Range" :gradient "Gradient"} (:type @tmp-cscale-atom))))
        apply-button (JButton. "Apply")]
     (add-action-listener convert-cb
         (fn [] (let [prev-type (:type @tmp-cscale-atom)
                      new-type ({"Gradient" :gradient "Range" :range} (.getSelectedItem convert-cb))]
                   (when (not= prev-type new-type)
                     (case [prev-type new-type]
                           [:gradient :range] (swap! tmp-cscale-atom gradient-colorscale-to-range)
                           ;other options blah blah
                      )))))
     (add-action-listener apply-button
        (fn [] (reset! cscale-atom @tmp-cscale-atom)
               (redrawing-function)))))

Basically, when you click apply, you're copying the contents of tmp-cscale-atom (inside of #'inav/create-colorscale-editor) into cscale-atom (inside of a let-block in #'inav/more-grid-options-dialog), which triggers a watch that automatically copies the colorscale from cscale-atom into xmrg-cache (globally defined #'inav/xmrg-cache).

When editing it THIS way, the hashcode for < ends up being this:

(format "%x" (.hashCode (get-in @xmrg-cache [[0 0] :colorscale :scale 0 :bound 0 0]))) => "5c370bd0"

A final note on this behavior:

When you call "redrawing-function" from INSIDE the apply-button action listener, the attempt to validate the range colorscale is successful.

When you call "redrawing-function" afterwards from OUTSIDE the apply-button action listener, the attempt to validate the range colorscale fails.

...and I just figured out the problem: I'm re-evaluating the colorscale as part of my revalidation function, which is called when I refresh the colorscale. This is messing things up.

  • Are you testing it in REPL? Could you provide a minimal example of this behavior with just `<` function along with steps how you run it? – Piotrek Bzdyl May 31 '17 at 19:10

3 Answers

Functions in Clojure are regular Java objects implementing the clojure.lang.IFn interface. When you load a namespace (including clojure.core), Clojure will compile functions (generate a new Java class, create an instance of it, and assign that instance as a var value). For example, the #'clojure.core/< var will get a new Java object implementing clojure.lang.IFn that happens to be less-than logic.

Clojure doesn't override the hashCode implementation in the generated function class, which thus inherits the default one from java.lang.Object. Thus every new instance has its own potentially different hash code. This is causing your issues: when a namespace gets reloaded, vars will get new function instances and thus different hash codes.
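The effect can be sketched at the REPL without reloading anything. This is only an illustration: `lt`, `captured` and `allowed` are hypothetical names, and evaluating the same `defn` twice stands in for whatever reload or re-evaluation happens in the real application.

```clojure
;; Hypothetical example: `lt` stands in for any function whose
;; definition is evaluated more than once (e.g. by a namespace reload).
(defn lt [a b] (< a b))

;; Capture the current function object, as a data structure would.
(def captured lt)
(def allowed #{lt})

;; Re-evaluating the same source creates a brand-new function object.
(defn lt [a b] (< a b))

(identical? captured lt)      ; false: two distinct objects
(= captured lt)               ; false: functions compare by identity
(contains? allowed captured)  ; true: the same object that was stored
(contains? allowed lt)        ; false: the new object is not in the set
```

The set lookup fails for the same reason the validation in the question fails: the stored object and the current value of the var are no longer the same instance.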

On the other hand I would check how your test works:

  • Are there any namespaces reloaded during your tests' execution?
  • Are you storing a global state (e.g. < function in a global atom) outside of the test function scope?

Maybe you should use a local scope for expected values in your test functions instead?

Piotrek Bzdyl
  • Thank you to everyone, but I marked this one right because it pointed me in the right direction with "are there any namespaces reloaded during your tests' execution". During the tests execution no, but during my data refresh/revalidation function yes, and that was messing things up. – sleepysilverdoor Jun 01 '17 at 11:48

I have been able to reproduce part of this behaviour by explicitly reloading clojure.core and observing that the hash code of the function changes when the namespace is reloaded, though the hash code of the var containing that function does not change when clojure.core is reloaded.

user> (.hashCode <) 
87529528
;; jump to clojure.core and reload namespace
user> (.hashCode <) 
228405583

user> (.hashCode #'<) 
1242688388
;; jump to clojure.core and reload namespace
user> (.hashCode #'<) 
1242688388

From the code you've posted, I can't tell what in your editing process could cause these forms to be re-evaluated, so there may be other causes of this. One workaround might be to store the var containing each test function in the map rather than the function object itself. You can do this using the #' reader macro.

Calling a var as a function automatically calls through to the function in the var so no changes elsewhere would be required.
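A minimal sketch of that workaround, assuming a range entry shaped like the ones in the question (the names `allowed-ops` and `entry` are made up for illustration):

```clojure
;; Store vars (#'<) rather than the function objects (<).
(def allowed-ops #{#'< #'<= #'> #'>=})

;; Hypothetical range entry in the style of the question's colorscales.
(def entry {:bound [[#'>= 1] [#'< 2]] :r 255 :g 0 :b 0})

;; The membership check compares var objects, not the functions they hold.
(every? allowed-ops (map first (:bound entry)))           ; true

;; A var can be called like the function it currently holds.
(every? (fn [[op limit]] (op 1.5 limit)) (:bound entry))  ; true: 1.5 >= 1 and 1.5 < 2
```

Because the var is looked up each time it is called, it always dispatches to whatever function is currently bound to it.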

Arthur Ulfeldt
  • Using a var would probably be a functional workaround, but I think it's conceptually cleaner to store the symbols instead (i.e., `'<` instead of `<`) and have a manual lookup table for the small number of mappings that you intend to actually allow. That way you couldn't store something surprising in there, and the data will be comfortably round-trippable through JSON, or whatever other serialization format, rather than being something that only makes sense in memory. Functions and vars really aren't designed to be treated as comparable values. – amalloy May 31 '17 at 19:29

Coincidentally, I noticed a related behavior just last week. When you define identical functions, they don't get the same hashcode:

(defn ink [x] (+ 1 x))
(spyx (hash ink))
(spyx ink)

(defn ink [x] (+ 1 x))
(spyx (hash ink))
(spyx ink)

(hash ink) => 539734147
ink => #object[tst.clj.core$ink 0x202bb083 "tst.clj.core$ink@202bb083"]

(hash ink) => 757183584
ink => #object[tst.clj.core$ink 0x2d21b460 "tst.clj.core$ink@2d21b460"]

So each defn is generating a new function object with a new hashcode (in fact the function object label 0x202bb083 is just the hex value of the hash 539734147). This behavior is identical to that seen when creating two separate java Object instances:

(hash (Object.)) => 1706817395
(hash (Object.)) => 969679245

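Recall that the default implementation of Object.hashCode() returns an identity-based integer that is, as far as practical, distinct for distinct objects; depending on the JVM, it may or may not be derived from the object's memory address.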
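As a quick check of the relationship between these numbers, here is a small sketch using a throwaway function `f` (a hypothetical name, not from the question):

```clojure
;; A throwaway function object to inspect.
(def f (fn [x] x))

;; Function classes inherit Object's hashCode, so it is the identity hash.
(= (.hashCode f) (System/identityHashCode f))  ; true

;; clojure.core/hash falls back to .hashCode for function objects.
(= (hash f) (.hashCode f))                     ; true

;; The hex value shown when a function is printed as #object[... 0x...]
;; is this same identity hash.
(format "%x" (System/identityHashCode f))
```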

So the upshot is that function objects can't be compared for equality by value: two functions with identical definitions are still distinct objects. We need a workaround where we store a token as a map key and a function instance as the corresponding map value. Here is one way:

(defn ink [x] (+ 1 x))
(defn dek [x] (- x 1))

(def sym->fn {'++ ink
              '-- dek})

(defn runner [form]
  (let [[fn-symbol val] form
        fn-impl         (get sym->fn fn-symbol)
        result          (fn-impl val)]
       result))

(runner '(++ 2)) => 3
(runner '(-- 5)) => 4
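Applied to the range colorscales from the question, the same idea might look like the sketch below. It assumes the bounds are stored as quoted symbols rather than function objects; `op->fn`, `valid-bound?` and `in-bound?` are hypothetical helpers, not part of the original code.

```clojure
;; Lookup table from plain-data tokens to the comparison functions.
(def op->fn {'< < '<= <= '> > '>= >=})

;; Hypothetical range entry: operators are symbols, so the whole map is
;; plain data and does not depend on any function object's identity.
(def entry '{:bound [[>= 1] [< 2]] :r 255 :g 0 :b 0})

(defn valid-bound?
  "True when every operator token in the bound is in the lookup table."
  [bound]
  (every? #(contains? op->fn (first %)) bound))

(defn in-bound?
  "True when x satisfies every [op limit] pair in the bound."
  [bound x]
  (every? (fn [[op limit]] ((op->fn op) x limit)) bound))

(valid-bound? (:bound entry))   ; true
(in-bound? (:bound entry) 1.5)  ; true
(in-bound? (:bound entry) 2)    ; false
```

Symbols compare by name, so the validation no longer depends on when or how often any namespace was evaluated.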
Alan Thompson