I have the following dataframe (sorry for not providing an example with dput, it doesn't seem to work with lists when I paste it here):

data

Now I am trying to create a new column y that holds the difference between mnt_ope and each element of ref_amount. The result would be, in each row, a list with the same number of elements as the corresponding value of ref_amount.

I have tried:

data <- data %>%
   mutate( y = mnt_ope - ref_amount)

But I get this error:

Evaluation error: non-numeric argument to binary operator.

With dput:

structure(list(mnt_ope = c(500, 500, 771.07, 770.26, 770.26, 
770.26, 770.72, 770.72, 770.72, 770.72, 770.72, 779.95, 779.95, 
779.95, 779.95, 2502.34, 810.89, 810.89, 810.89, 810.89, 810.89
), ref_amount = list(c(500, 500), c(500, 500), c(771.07, 770.26, 
770.26), c(771.07, 770.26, 770.26), c(771.07, 770.26, 770.26), 
    c(771.07, 770.26, 770.26), c(771.07, 770.26, 770.26), c(771.07, 
    770.26, 770.26), c(771.07, 770.26, 770.26), c(771.07, 770.26, 
    770.26), c(771.07, 770.26, 770.26), c(771.07, 770.26, 770.26
    ), c(771.07, 770.26, 770.26), c(771.07, 770.26, 770.26), 
    c(771.07, 770.26, 770.26), 2502.34, c(810.89, 810.89, 810.89
    ), c(810.89, 810.89, 810.89), c(810.89, 810.89, 810.89), 
    c(810.89, 810.89, 810.89), c(810.89, 810.89, 810.89))), row.names = c(NA, 
-21L), class = c("tbl_df", "tbl", "data.frame"))
Hobo
Vincent
  • Please use `dput` to show the dataset. Is it a `list` column or not? – akrun Jul 03 '18 at 14:09
  • @jogo sorry, it's `ref_amount`, not `diff_amount`. @akrun I am sorry but I can't seem to use dput. It doesn't paste in the right format in the text editor. And yes, it's a `list` column. – Vincent Jul 03 '18 at 14:11
  • My guess is you would need to do this with something like `purrr::map`/`purrr::pmap` within `mutate`. – aosmith Jul 03 '18 at 14:11
  • @akrun sorry, my bad, I have made the necessary edits. – Vincent Jul 03 '18 at 14:17

2 Answers

You can't subtract directly from a list column in that way using dplyr. The best way I have found to accomplish the task you are describing is to use purrr::map2. Here is how it works:

data <- data %>% mutate(y = map2(mnt_ope, ref_amount, function(x, y){ x - y }))

Or, more tersely:

data <- data %>% mutate(y = map2(mnt_ope, ref_amount, ~.x - .y))

map2 here applies a two-input function to two vectors in parallel (in your case, two columns of a data frame) and returns the results as a list, which mutate then attaches to your data frame as a new list column.
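To see the shape of the result, here is a minimal self-contained sketch. The two-row `toy` tibble is a hypothetical stand-in for the asker's data, with values taken from the dput output; it assumes dplyr and purrr are installed.

```r
library(dplyr)
library(purrr)

# Hypothetical toy data: a numeric column and a list column
toy <- tibble(
  mnt_ope    = c(500, 771.07),
  ref_amount = list(c(500, 500), c(771.07, 770.26, 770.26))
)

# map2 pairs mnt_ope[i] with ref_amount[[i]] and returns a list
toy <- toy %>%
  mutate(y = map2(mnt_ope, ref_amount, ~ .x - .y))

# Each element of y has the same length as the matching ref_amount
lengths(toy$y)  # 2 3
is.list(toy$y)  # TRUE
```

Because y is a list column, individual results are read with double brackets, e.g. `toy$y[[2]]`.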

Hope that helps!

Robert Kahne
  • Thanks, that works perfectly indeed. Now I guess that this doesn't work in `sparklyr` unfortunately... I'll probably need to expand to several lines when going to `sparklyr` – Vincent Jul 03 '18 at 14:21
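One way to read "expand to several lines" is to unnest the list column so that each element of ref_amount gets its own row, after which the subtraction is an ordinary vectorised one. The sketch below shows this locally with tidyr (assuming tidyr 1.0.0 or later for the `cols` argument); `toy`, `long` and `row_id` are hypothetical names, and this has not been checked against sparklyr.

```r
library(dplyr)
library(tidyr)

# Hypothetical toy data: a numeric column and a list column
toy <- tibble(
  mnt_ope    = c(500, 771.07),
  ref_amount = list(c(500, 500), c(771.07, 770.26, 770.26))
)

long <- toy %>%
  mutate(row_id = row_number()) %>%  # remember the original row of each value
  unnest(cols = ref_amount) %>%      # one row per element of ref_amount
  mutate(y = mnt_ope - ref_amount)   # plain column arithmetic, no list column

nrow(long)  # 5
```

The `row_id` column makes it possible to group the differences back by original row later on.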

This works for a single row; to cover every row, you need to add a loop.

For example, for the 5th data point, dt$mnt_ope[5] - unlist(dt$ref_amount[5]) yields:

[1] -0.81  0.00  0.00

With a while loop over the number of rows (simpler than purrr):

# R indexes from 1, so start at 1 and include the last row
i <- 1
while (i <= nrow(dt)) {
  print(dt$mnt_ope[i] - unlist(dt$ref_amount[i]))
  i <- i + 1
}

output :

[1] 0 0
[1] 0 0
[1] 0.00 0.81 0.81
[1] -0.81  0.00  0.00
[1] -0.81  0.00  0.00
[1] -0.81  0.00  0.00
[1] -0.35  0.46  0.46
[1] -0.35  0.46  0.46
[1] -0.35  0.46  0.46
[1] -0.35  0.46  0.46
[1] -0.35  0.46  0.46
[1] 8.88 9.69 9.69
[1] 8.88 9.69 9.69
[1] 8.88 9.69 9.69
[1] 8.88 9.69 9.69
[1] 0
[1] 0 0 0
[1] 0 0 0
[1] 0 0 0
[1] 0 0 0
[1] 0 0 0
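The loop above only prints the differences. To keep them as a new list column y, as the question asks, a base R sketch with Map() could look like this; the two-row `dt` here is a hypothetical stand-in built from values in the dput output.

```r
# Hypothetical toy data frame with a list column
dt <- data.frame(mnt_ope = c(500, 770.26))
dt$ref_amount <- list(c(500, 500), c(771.07, 770.26, 770.26))

# Map() walks both columns in parallel and returns a list,
# which is assigned back as a new list column
dt$y <- Map(function(x, y) x - y, dt$mnt_ope, dt$ref_amount)

str(dt$y)
```

Map() is the base R counterpart of a two-argument map, so no extra package is needed.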
anuanand