Let's assume the following data frame (though my real dataset has many more columns):

df <- data.frame(date = c("01.01.2010", "02.01.2010", "03.01.2010"),
                 x1 = c(1, 2, 4), y1 = c(1, 2, 3), x2 = c(1, 2, 2), y2 = c(3, 4, 4), x3 = c(3, 3, 3), y3 = c(3, 4, 5))

Date          x1  y1  x2  y2  x3  y3
01.01.2010    1   1   1   3   3   3
02.01.2010    2   2   2   4   3   4
03.01.2010    4   3   2   4   3   5

First, I want to calculate the sum of all y columns (every second column) as a new column called y_total and add it to the data frame. Second, for every x column I want to compute a new column: x1_new = x1 * (y1 / y_total), x2_new = x2 * (y2 / y_total), x3_new = x3 * (y3 / y_total). I want to calculate all of the x1_new, x2_new, x3_new columns at once, because my real dataset has up to 60 of these columns. For this example, the result should look like this:

Date          x1  y1  x2  y2  x3  y3  y_total   x1_new   x2_new   x3_new
01.01.2010    1   1   1   3   3   3     7       0.1428   0.4286   1.286
02.01.2010    2   2   2   4   3   4     10      0.4      0.8      1.2
03.01.2010    4   3   2   4   3   5     12      1        0.6666   1.25

Is there a way to compute the new x columns for every old x column at once? I am asking because I sometimes have data frames with 90 x columns. Thanks in advance!

ZayzayR

1 Answer

Does this work?

> library(dplyr)
> df %>% mutate(y_total = rowSums(select(., starts_with("y"))), x1_new = x1*(y1/y_total))
        date x1 y1 x2 y2 x3 y3 y_total    x1_new
1 01.01.2010  1  1  1  3  3  3       7 0.1428571
2 02.01.2010  2  2  2  4  3  4      10 0.4000000
3 03.01.2010  4  3  2  4  3  5      12 1.0000000
> 
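
The call above only produces x1_new. One way to get every x*_new column in a single step is sketched below in base R. It assumes that each x column is named x<number> and has a matching y<number> column, as in the example data; the helper names x_cols, y_cols and new_cols are only illustrative.

```r
# Example data from the question (dates quoted so they are valid R strings)
df <- data.frame(date = c("01.01.2010", "02.01.2010", "03.01.2010"),
                 x1 = c(1, 2, 4), y1 = c(1, 2, 3),
                 x2 = c(1, 2, 2), y2 = c(3, 4, 4),
                 x3 = c(3, 3, 3), y3 = c(3, 4, 5))

# Assumption: x columns are named x1, x2, ... and each has a partner y1, y2, ...
x_cols   <- grep("^x[0-9]+$", names(df), value = TRUE)
y_cols   <- sub("^x", "y", x_cols)      # matching y column for every x column
new_cols <- paste0(x_cols, "_new")      # names of the columns to create

# Row-wise sum of all y columns
df$y_total <- rowSums(df[y_cols])

# Data frame arithmetic is element-wise and pairs columns by position,
# so this computes x_i * y_i / y_total for every pair at once
df[new_cols] <- df[x_cols] * df[y_cols] / df$y_total

print(df)
```

Because the columns are paired by name through sub(), the same lines should work unchanged for 60 or 90 x/y pairs, as long as the naming scheme holds.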
Karthik S
  • This would work, but I need a more automated way, as I sometimes have 60 columns on which I need to perform the same operation. So, for example, there are sometimes x1_new, x2_new, ..., x60_new, and I would like to calculate all of these columns in a more concise way. – ZayzayR Oct 20 '20 at 17:50
  • You could consider using `across()` from the `dplyr` package with `starts_with` to select variables that match a certain prefix. – Eric Oct 20 '20 at 17:55
  • @ZayzayR, have modified my code, can you please check now. – Karthik S Oct 20 '20 at 17:56
  • @KarthikS Thank you, but unfortunately this still does not solve my problem entirely. I have edited my question to make it clearer. I also need the columns for x2_new, x3_new and so on. – ZayzayR Oct 20 '20 at 18:08
  • This could be helpful: [R: mutate over multiple columns to create a new column](https://stackoverflow.com/questions/47464512/r-mutate-over-multiple-columns-to-create-a-new-column) – Eric Oct 20 '20 at 19:13
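
The `across()` suggestion in the comments could look roughly like the sketch below. It assumes dplyr 1.0.0 or later and the x<number>/y<number> naming from the question. Looking up the partner column with `get()` and `cur_column()` is one common pattern for this, not necessarily what the commenter had in mind, and `result` is just an illustrative name.

```r
library(dplyr)

# Example data from the question (dates quoted so they are valid R strings)
df <- data.frame(date = c("01.01.2010", "02.01.2010", "03.01.2010"),
                 x1 = c(1, 2, 4), y1 = c(1, 2, 3),
                 x2 = c(1, 2, 2), y2 = c(3, 4, 4),
                 x3 = c(3, 3, 3), y3 = c(3, 4, 5))

result <- df %>%
  mutate(
    # Row-wise sum over all y columns (assumed to be named y1, y2, ...)
    y_total = rowSums(across(matches("^y[0-9]+$"))),
    # For every x column, fetch the y column with the same number and
    # compute x * y / y_total; the result is stored as <x column>_new
    across(matches("^x[0-9]+$"),
           ~ .x * get(sub("^x", "y", cur_column())) / y_total,
           .names = "{.col}_new")
  )

print(result)
```

The regular expressions keep the selection limited to the numbered x and y columns, so y_total itself is not picked up by the y selection.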