1

Here is the situation: I have a vector of vectors ("data"), a set of headers, a subset of headers ("primary headers"), a constant ("C"), an element-wise function ("f"), and the remaining headers ("secondary headers"). My goal is to take the "data" and produce a new vector of vectors.

Example data:

[[1.0 "A" 2.0]
[1.0 "B" 4.0]]

Example headers:

["o1" "i1" "i2"]

Example primary headers:

 ["i1" "i2"]

Example secondary headers:

 ["o1"]

Example new vector of vectors:

[[(f "A") (f 2.0) C (f 1.0)]
[(f "B") (f 4.0) C (f 1.0)]]

My current attempt is to mapv each row, then map-indexed each element with an if to check for primary membership, then the constant, then map-indexed each element with an if to check for secondary membership, finally conj on the results. But I am not getting it to work right.

Example code:

(mapv (fn [row] (conj (vec (flatten (map-indexed
                                    (fn [idx item] (let [header-name (nth headers idx)] 
                                                        (if (= (some #{header-name} primary-headers) headers-name) (f item))))
                                    row)))

                  C
                  (vec (flatten (map-indexed
                                    (fn [idx item] (let [header-name (nth headers idx)] 
                                                        (if (= (some #{header-name} secondary-headers) headers-name) (f item))))
                                    row)))))
 data)
Brian Tompsett - 汤莱恩
user1559027

3 Answers

2

You should consider using core.matrix for stuff like this. It is a very flexible tool for multi-dimensional array programming in Clojure.

Most array-manipulation operations are likely to be one- or two-liners:

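;; emap, transpose and get-column come from core.matrix
(require '[clojure.core.matrix :refer [emap transpose get-column]])

(def DATA [[1.0 "A" 2.0]
           [1.0 "B" 4.0]])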

(emap (partial str "f:") (transpose (mapv #(get-column DATA %) [1 0 2])))
=> [["f:A" "f:1.0" "f:2.0"] 
    ["f:B" "f:1.0" "f:4.0"]]

You might need to look up the column names to calculate the [1 0 2] vector, but hopefully this gives you a good idea of how to do it.
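The column-name lookup mentioned above could be sketched in plain Clojure as follows. The helper name `column-indices` is hypothetical (it is not part of core.matrix), and the sketch assumes every wanted header actually appears in `headers`.

```clojure
(def headers ["o1" "i1" "i2"])

;; Hypothetical helper: translate header names into column positions.
(defn column-indices
  [headers wanted]
  (let [position (zipmap headers (range))] ; {"o1" 0, "i1" 1, "i2" 2}
    (mapv position wanted)))               ; look up each wanted header

(column-indices headers ["i1" "o1" "i2"])
;; => [1 0 2]
```

A header that is missing from `headers` would come back as nil, so a real version would probably want to check for that before indexing columns.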

mikera
  • I've been thinking that I need a data structure to help me do these things. Still trying to just make this work for now. Do you think something like Incanter would be useful for all of this? – user1559027 Apr 14 '14 at 22:08
1

I'm not sure I understood your problem correctly, but it looks like you want something like this:

(defn magic [data h p s f]
  (let [idx (map (into {} (map-indexed #(vector %2 %1) h))
                 (concat p s))]
    (mapv #(mapv (comp f (partial get %))
                 idx)
          data)))

Here is an example of my magic function:

(magic [[1.0 "A" 2.0]
        [1.0 "B" 4.0]]
       ["o1" "i1" "i2"]
       ["i1" "i2"]
       ["o1"]
       #(str "<" % ">"))

[["<A>" "<2.0>" "<1.0>"]
 ["<B>" "<4.0>" "<1.0>"]]

Let's take a closer look at it.

First of all, I calculate the permutation index idx. In your case it's (1 2 0). To calculate it, I turn ["o1" "i1" "i2"] into the hash map {"o1" 0, "i1" 1, "i2" 2} and then apply that map to the sequence ("i1" "i2" "o1") of primary and secondary headers.
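As a small illustration of this step on its own, the sketch below builds the same lookup map and index sequence at the top level. The names `header->index` and `idx` are only for illustration here; inside magic they are local values.

```clojure
(def headers ["o1" "i1" "i2"])

;; Pair every header with its position, then collect the pairs into a map.
(def header->index
  (into {} (map-indexed #(vector %2 %1) headers)))
;; header->index is {"o1" 0, "i1" 1, "i2" 2}

;; A Clojure map is also a function of its keys, so it can be mapped
;; directly over the reordered header names.
(def idx
  (map header->index (concat ["i1" "i2"] ["o1"])))
;; idx is (1 2 0)
```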

Then I use idx to rearrange the columns of the data matrix. In the same step I also apply the function f to each element of the rearranged matrix.

Update

I think it would be best to split my complicated magic function into three simpler ones:

(defn getPermutation [h1 h2]
  (map (into {} (map-indexed #(vector %2 %1) h1))
       h2))

(defn permutate [idx data]
  (mapv #(mapv (partial get %) idx)
        data))

(defn mmap [f data]
  (mapv (partial mapv f)
        data))

Each function here is atomic (i.e. it performs a single task), and they can easily be combined to do exactly what the magic function does:

(defn magic [data h p s f]
  (let [idx (getPermutation h (concat p s))]
    (->> data
         (permutate idx)
         (mmap f))))

The getPermutation function calculates the idx permutation index.

permutate rearranges the columns of a data matrix according to the given idx.

mmap applies a function f to each element of a data matrix.

Update 2

Last time I missed the part about adding a constant. To support it we need to change some of the code. Let's change the permutate function so that it can insert new values into the matrix.

(defn permutate [idx data & [default-val]]
  (mapv #(mapv (partial get %) idx (repeat default-val))
        data))

Now it uses default-val whenever it cannot find an element at the specified index in idx.
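The fallback relies on the three-argument form of get, which returns its last argument when the key is not found. A minimal sketch on a single row, with illustrative values chosen to match the example in this answer:

```clojure
;; One row that has already been passed through f.
(def row ["<1.0>" "<A>" "<2.0>"])

;; nil marks the slot where the constant should go.
(def idx [1 2 nil 0])

;; (get row nil "-->") finds nothing at key nil, so it returns "-->".
(mapv (partial get row) idx (repeat "-->"))
;; => ["<A>" "<2.0>" "-->" "<1.0>"]
```

Note that an out-of-range index would also produce the default value, so a wrong idx fails quietly rather than throwing.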

We'll also need a new magic function:

(defn magic2 [data h p s f c]
  (let [idx (getPermutation h (concat p [nil] s))]
    (permutate idx (mmap f data) c)))

I changed the order of applying mmap and permutate functions because it seems that you don't want to apply f to your constant.

And it works:

(magic2 [[1.0 "A" 2.0]
         [1.0 "B" 4.0]]
        ["o1" "i1" "i2"]
        ["i1" "i2"]
        ["o1"]
        #(str "<" % ">")
        "-->")

[["<A>" "<2.0>" "-->" "<1.0>"]
 ["<B>" "<4.0>" "-->" "<1.0>"]]
Leonid Beschastny
  • I still need to test this out, but where does the constant C get added in, between p and s? I'm excited if this works, it uses functions I'm not familiar with. – user1559027 Apr 11 '14 at 02:44
  • Looks like I missed the part about a constant. – Leonid Beschastny Apr 11 '14 at 06:06
  • @user1559027 I updated my answer with the part about adding a constant. – Leonid Beschastny Apr 12 '14 at 21:25
  • I am so close to getting your solution working. My only trouble is that permutate is only returning the default value, despite getPermutation and mmap working correctly. – user1559027 Apr 15 '14 at 02:08
  • I marked this comment as the answer. It was the first one I got to work, after quite a bit of modification. You and @Thumbnail were both very educational. There were some extra steps I had to accommodate when passing `f` and in processing the results, but I worked those out. – user1559027 Apr 15 '14 at 03:11
1

Given

(def data [[1.0 "A" 2.0] [1.0 "B" 4.0]])
(def headers ["o1" "i1" "i2"])
(def primaries ["i1" "i2"])
(def secondaries ["o1"])

(defn invert-sequence [s] (into {} (map-indexed (fn [i x] [x i]) s)))

... this does the job:

(defn produce [hs ps ss f data const]
  (let [perms (map #(mapv (invert-sequence hs) %) [ps ss])]
    (mapv (fn [v] (->> perms
                       (map #(map (comp f v) %))
                       (interpose [const])
                       (apply concat)
                       vec))
          data)))

Using the example in the question:

(produce headers primaries secondaries #(list 'f %) data 'C)
; [[(f "A") (f 2.0) C (f 1.0)] [(f "B") (f 4.0) C (f 1.0)]]

Using Leonid Beschastny's example:

(produce headers primaries secondaries #(str "<" % ">") data 'C)
; [["<A>" "<2.0>" C "<1.0>"] ["<B>" "<4.0>" C "<1.0>"]]

Using str:

(produce headers primaries secondaries str data 'C)
; [["A" "2.0" C "1.0"] ["B" "4.0" C "1.0"]]

Using identity:

(produce headers primaries secondaries identity data 'C)
; [["A" 2.0 C 1.0] ["B" 4.0 C 1.0]]
Thumbnail
  • I've been trying for a while, but I can't get your solution to work. It has the constant, which is great, but it doesn't seem to work with applying the function. Function should be applied to each element. @Leonid-Beschastny has the right idea there. – user1559027 Apr 11 '14 at 19:34
  • @user1559027 What do you get when you run the code? I copied it back off the page and it works perfectly here, using either version of function `produce`. And the function (`#(list 'f %)` in the example) *is* applied to every element, as the output shows. This example function literally mimics your output - it does nothing useful. Meantime, I'll get rid of the untidy version of `produce`, to reduce the possibilities for confusion. – Thumbnail Apr 11 '14 at 19:58
  • I'll try to produce something more helpful later, but for now I'm getting `clojure.lang.LazySeq cannot be cast to clojure.lang.IFn`. This is happening when I try to put my true f function directly in the subroutine. Even using "str" or "identity". This is while I am trying to adapt the code into an existing application. What you have may be correct, and it's just me not being able to understand it. – user1559027 Apr 11 '14 at 21:23
  • @user1559027 What's the actual text of your true `f` function? Put it in `code quotes`, to show how far it extends. Both `str` and `identity` work here. – Thumbnail Apr 11 '14 at 21:35
  • I will see what I can do about getting `f` on here. The problem with `f` is that it requires looking up values in other vectors, which depends on knowing which position (according to `hs`) is being mapped. `map-indexed` was able to help, but I could not figure out the rearrangement of the items. – user1559027 Apr 12 '14 at 01:24
  • `(fn [idx item] (let [header-name (nth hs idx)] (if (= (some #{header-name} divide-headers) header-name) (vector (str (/ item (nth (nth divisors 1) (.indexOf (nth divisors 0) header-name))))) (vec (function-that-returns-a-list)))))` – user1559027 Apr 12 '14 at 01:31
  • `(def divisors [["i2" "o1"][25.0 1000.0]])` – user1559027 Apr 12 '14 at 01:35
  • @user1559027 Your function has two unbound symbols: `hs`, which I take to be my `headers`; and `divide-headers`, which I have no idea about. – Thumbnail Apr 12 '14 at 08:09