8

Apologies for any mistaken terminology--I'm quite new to computer science, and I pretty much only know Clojure (but I guess I'd say I know it pretty well).

So, I haven't done a ton of research on this, but I've sometimes found it useful when writing Clojure code to be able to refer to some "intermediate version of whatever data structure I'm in" from within that data structure (a lot like in a let). Quick examples:

=> (self-ish {:a 10
              :b (inc (this :a))
              :c (count (vals this))})
=> {:a 10, :b 11, :c 3}
=> (self-ish ["a" "b" (reduce str this)])
=> ["a" "b" "ab"]
//Works in any nested bits too
=> (self-ish [1 2 3 [4 5 (first this)] 6 [7 [8 (cons (second this) (nth this 3))]]])
=> [1 2 3 [4 5 1] 6 [7 [8 (2 4 5 1)]]]

The idea is that the structure builds itself up incrementally, and at any stage has the ability to refer to the current intermediate structure as this. Here's the code for my current implementation:

//Random straightforward but helpful definitions
(defn map-entry? [obj]
  (instance? clojure.lang.AMapEntry obj))
(def Map clojure.lang.IPersistentMap)
(def Vector clojure.lang.IPersistentVector)
(def List clojure.lang.IPersistentList)
(def Set clojure.lang.IPersistentSet)

(defn append
  [x coll]
  (if-not coll x
    (condp instance? coll
      Map (if (empty? x) coll
            (assoc coll (first x) (second x)))
      Vector (conj coll x)
      Set (conj coll x)
      List (apply list (concat coll [x]))
      (concat coll [x]))))

(defn build-this
  [acc-stack acc]
  (->> (cons acc acc-stack)
       (drop-while list?)
       (drop-while (every-pred empty? identity))
       (reduce append)))

(defn self-indulge
  [acc-stack acc form]
  ;//Un-comment the following to see it print intermediate stages of processing
  #_(println "this:" (build-this acc-stack acc) "\n  at:" form)
  (append (cond
            (coll? form) (reduce (partial self-indulge (cons acc acc-stack))
                                 (if (map-entry? form) []
                                   (empty form))
                                 form)
            (= (quote this) form) (build-this acc-stack acc)
            :else form)
          acc))

(defmacro self-ish
  [form]
  (self-indulge () nil form))

The append function appends an item onto a collection and returns the same type of collection. The self-indulge function has a standard reduce-like accumulator, which just builds up elements of form. It also has a accumulator stack, which gets longer every time self-indulge recurs upon itself. The point of this is to keep track of other "higher up" accumulators, so that this will be the entire structure, not just a local piece. The self-ish macro just nicely wraps up self-indulge (which calls itself using partial, so it can't wear the macro pants).

Edit: example use case To me, this macro is about trying to increase code readability, not truly extending functionality. Where I have found this useful is in cases where I have hand-written structures with partially redundant data--or maybe "dependent" is a better word. It can be easier to read the code and see what different parts of the data structure hold, and it can also be useful if I modify data values in one part of a structure and want that change to be reflected in other parts. For example:

=> (self-ish {:favorite-books (list "Crime and Punishment" "Mrs. Dalloway")
              :favorite-things (list* "Ice Cream" "Hammocks" (this :favorite-books)})
=> {:favorite-things ("Ice Cream" "Hammocks" "Crime and Punishment" "Mrs. Dalloway"),
    :favorite-books ("Crime and Punishment" "Mrs. Dalloway")}

It could also be useful in times where one might really like to include something baked into the data, as opposed to derived on the fly using some function. These cases are probably much rarer, and I think it would be a bad idea to tangle the data unnecessarily when you could just have nice clean functions manipulating it.

My main questions:

  1. Is this actually useful, or would the ambiguity/complexity incurred be too much? I imagine I'm not alone in wanting/using this type of macro. What are others' experiences here? Do you use something like this? Have you found better workarounds? Are there reasons something like this isn't in any Clojure libraries? Or is there something that I haven't yet seen?
  2. Are there better naming conventions I might use--as opposed to self-ish and this? For example, maybe this is too loaded with OOP meaning, I'm not sure, I am basically only familiar with Clojure.
  3. I am pretty new to computer science, are there accessible and informative resources related to this type of thing--I guess I would call it anonymous self referential (maybe reflexive is a better word?) data structures? I haven't found anything both approachable and informative yet.
  4. Is there a better way to write the self-ish macro? Above, I've included my current version of it, but I can't shake the feeling there may be a simpler way.
  5. I have various questions about what might be the "wisest" implementation details.

    • Traversal: Should it be breadth first or depth first? If depth first, preorder, postorder, or inorder? Right now, I believe it's depth first preorder, which makes sense to me, but maybe it has some drawbacks I haven't noticed.
    • Order problems: (See here for a related previous question of mine) Within {} (i.e. maps written by hand) it's impossible to maintain order properly (above 8 map entries) without using array-map or sorted-map--in other words, above 8 map entries, {} usage is unsafe. Maybe instead of hand-written order, the macro could do some fancy magic to process items in some "ideal" order? Or perhaps it would be better to wrap all maps within (array-map ...) instead of the eye-pleasing {}?

      //Showing maps with 9 entries failing
      => (self-ish {:a 1
                    :b (inc (this :a))
                    :c (inc (this :b))
                    :d (inc (this :c))
                    :e (inc (this :d))
                    :f (inc (this :e))
                    :g (inc (this :f))
                    :h (inc (this :g))
                    :i (inc (this :h))})
      => NullPointerException   clojure.lang.Numbers.ops (Numbers.java:942)
      //8 works fine
      => (self-ish {:a 1
                    :b (inc (this :a))
                    :c (inc (this :b))
                    :d (inc (this :c))
                    :e (inc (this :d))
                    :f (inc (this :e))
                    :g (inc (this :f))
                    :h (inc (this :g))})
      => {:h 8, :g 7, :f 6, :e 5, :d 4, :c 3, :b 2, :a 1}
      
    • Serial: As I've written it, the macro avoids infinite recursion by dealing with its elements serially, similar to let, but that does produce potentially odd behavior. For example, in my above quick example, (reduce str this) returns "ab" because this is ["a" "b"] at that step. Maybe it would be useful sometimes to create some sort of infinite lazy sequence instead? If so, how might that be implemented?

    • Map entries: Right now, map entries are treated like vectors, but because of how this could be invoked at any intermediate step, it is totally possible to get a nil value from a key that has "not yet" been assigned a value. That is why in my first quick example, :c ended up mapped to 3--because intermediately there was a nil corresponding to :c, and that got counted as well. Do you think this warrants fixing?
    • Non-macro utility: It would be trivial to use just self-indulge outside of the macro context, but could this ever be useful?

Thanks for reading, any help is appreciated :)

Community
  • 1
  • 1
Omri Bernstein
  • 1,893
  • 1
  • 13
  • 17
  • Looks interesting, but could you mention any useful use case for such thing other than making things complex and ambiguous. – Ankur Jul 25 '12 at 16:22
  • Looks philosophical, maybe this is the wrong StackExchange site to ask this. – sortega Jul 25 '12 at 16:57
  • @Ankur Good idea, I'll edit to include a use case. It's not that it expands any capabilities, it just potentially makes writing code more straightforward/modular. Honestly though, I'm not sure it's really worth the added ambiguity/complexity, as you point out. That was what I was trying to get at with my first question, I should edit that, too. – Omri Bernstein Jul 25 '12 at 20:17
  • @sortega Would you suggest any different one in particular? I'm new to StackExchange. – Omri Bernstein Jul 25 '12 at 20:19

2 Answers2

2

in scheme this is done with (letrec ...) which lets you refer to the name of the data structure inside the structure itself. so if you want to define your own way of doing this it might make more sense to implement that. you could do it using the tricks with references described in the answers to Is it possible to create circular references in Clojure?

one advantage of letrec is that it has a user-specified name (if your macro is nested then this is shadowed).

[edited to remove comment on types as i don't understand your macro.]

[update] also, you may be interested in the discussion of anaphora in joy of clojure section 8.5.1

Community
  • 1
  • 1
andrew cooke
  • 45,717
  • 10
  • 93
  • 143
  • Thanks for pointing out `letrec`, I had seen that somewhere and forgotten--seems to be super relevant. It seems like it would be quite straightforward to make something similar to `letrec` with my code. Maybe I'm not understanding fully, but I'm not sure I agree that something like that circular reference example would be better, if only because it adds an unnecessary atom. Yes, my code does test for certain types, but only to maintain typing of any nested data structures as it traverses--currently, I don't see a way around this, even if I used something like that circular reference. – Omri Bernstein Jul 26 '12 at 00:24
  • if i understood right the atom is discarded at the end. so in the second example in that link, you'd finally use `@a`. i don't understand your macro tbh. – andrew cooke Jul 26 '12 at 01:31
  • oh, it's because of literals that the reader is creating i suppose? ok. in that case you should have set too. hmmm. nope, still don't get it. sorry. – andrew cooke Jul 26 '12 at 01:35
  • I can tend to get caught up in my own thoughts and not realize that what I'm saying is unclear. I re-factored the code a bit and added a line at the top of `self-indulge` you can un-comment to see the intermediate stages of `this` as well as the current `form` printed out. Tell me if this helped and/or if there are other things that are confusing or obtuse. – Omri Bernstein Jul 26 '12 at 02:50
  • Ah, I see what you are saying, yes good point, I do need to account for sets, I'll fix that in a second. Also though, I wouldn't get caught up with the typing things. Those are only in `append`, where I want to make sure that I append some item onto the end of a collection, but maintaining the type of that collection. This is necessary for the same reason that it is necessary in, for example, `postwalk`. I'm traversing the entire data structure and I need some way to preserve the types of various collections within it. – Omri Bernstein Jul 26 '12 at 03:01
2

This approach "feels" a bit wrong to me, though I'm not quite sure why. Maybe I don't like the idea of map construction being dependent on order....

Having said that it's a pretty easy macro to write, you effectively want something that expands to:

(let [this {}
      this (assoc this :a 1)
      this (assoc this :b (+ (this :a) 3))]
  this)

Hence an appropriate macro would be something like (for the map case):

(defmacro self-ish [bindings]
  `(let [~'this {}
         ~@(mapcat 
           #(do `(~'this (assoc ~'this ~@%)) )    
           (partition 2 bindings) )]
    ~'this))

(self-ish [:a 1
           :b (+ (this :a) 3)])
=> {:b 4, :a 1}

Note that I'm made the binding form a vector as a map binding form is unordered.

Still not sure how much I like this idiom. My preferred way is usually to define a structure with a let form and give meaningful names to interim calculations e.g.:

(let [meaningful-foo (something)
      meaningful-bar (something-else)]
   {:foo meaningful-foo
    :bar meaningful-bar
    :baz (some-calculation meaningful-foo meaningful-bar)})
mikera
  • 105,238
  • 25
  • 256
  • 415
  • 1
    Fantastic! The `let` at the bottom is totally a great idea, and really simple--I just never thought of that for some reason! In some cases it would be more verbose, but probably almost always it would be much simpler and clearer than my `self-ish` approach. As for your version of the `self-ish` macro, as it stands currently, aside from only working on maps, it also does not work on nested data. Ultimately, you *could* make the macro expand as a let with intermediate `this`s, where those intermediates would probably be some `reductions` call on `self-indulge`, but I don't see the advantage(s). – Omri Bernstein Jul 26 '12 at 04:06