I feel I'm writing functions needlessly for the following operation of setting several derived columns sequentially:

(defn add-cols [d]
  (do
    (setv (get d "col0") "0")
    (setv (get d "col1") (np.where (> 0 (get d "existing-col")) -1 1))
    (setv (get d "col2") (* (get d "col1") (get d "existing-col")))
    d))

The above is neither succinct nor easy to follow. I'd appreciate any help with converting this pattern into a macro. I'm a beginner with macros, but I'm thinking of creating something like this:

(pandas-addcols d
   `col0 : "0",
   `col1 : (np.where ( > 0 `existing-col) -1 1),
   `col2 : (* `col1 `existing-col))

I would appreciate any help or guidance on the above. The final form of the macro can obviously be different too. Ultimately, the most repetitive part is the multiple "setv" and "get" calls, and maybe there are more elegant and generic ways to remove those calls.

dedupe

1 Answer

A little syntactic sugar that can help is to use a shorter name for get and remove the need to quote the string literal. Here's a simple version of $ from this library. Also, Hy's setv already lets you provide more than one target–value pair.

(import
  [numpy :as np]
  [pandas :as pd])

(defmacro $ [obj key]
  (import [hy [HyString]])
  `(get (. ~obj loc) (, (slice None) ~(HyString key))))

(setv
  d (pd.DataFrame (dict :a [-3 1 3] :b [4 5 6]))
  ($ d col0) 0
  ($ d col1) (np.where (> 0 ($ d a)) -1 1))
Kodiologist
  • Thank you, this already helps immensely. I wonder if we can abbreviate it even more in a block where we already know the dataframe we'd be operating on, and return the full dataframe (i.e. the result of the 'do' in the example above). – dedupe Nov 06 '20 at 02:30
  • @dedupe See `wc` ("with columns") in the same library. – Kodiologist Nov 06 '20 at 02:48
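
To illustrate the kind of abbreviation asked about in the comments, here is a minimal sketch of a macro that takes the dataframe once, sets each column in order, and returns the dataframe. The name `set-cols` is hypothetical and this is not the library's `wc`; it reuses the `$` macro from the answer and assumes the same Hy version the answer targets (the `&rest` and `[numpy :as np]` syntax differs in later Hy releases).

```hy
(import
  [numpy :as np]
  [pandas :as pd])

;; `$` is copied from the answer above.
(defmacro $ [obj key]
  (import [hy [HyString]])
  `(get (. ~obj loc) (, (slice None) ~(HyString key))))

;; Hypothetical helper, not part of any library.
;; `pairs` alternates column names and value expressions.
;; The `d` form is repeated once per column in the expansion,
;; so pass a plain symbol rather than an expression with side effects.
(defmacro set-cols [d &rest pairs]
  `(do
     ~@(lfor [k v] (zip (get pairs (slice 0 None 2))
                        (get pairs (slice 1 None 2)))
         `(setv ($ ~d ~k) ~v))
     ~d))

(setv d (pd.DataFrame (dict :a [-3 1 3] :b [4 5 6])))

;; Columns are set in order, so col2 can refer to col1.
(set-cols d
  col0 0
  col1 (np.where (> 0 ($ d a)) -1 1)
  col2 (* ($ d col1) ($ d a)))
```

Value expressions still name the dataframe explicitly through `($ d ...)`. Removing that as well would require the macro to walk and rewrite the expressions, which this sketch deliberately leaves out.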