I'm trying to create several new columns by calculating the previous n values of 2 columns.
Example using Auto
. I made a column that calculates the average Horsepower per 1000lbs for the previous two model-years:
library(ISLR)
library(dplyr)
LaggedAuto <- Auto %>%
arrange(year) %>%
group_by(name) %>%
mutate(L2HPbyWT = 1000*((lag(horsepower)+lag(horsepower,2))/(lag(weight)+lag(weight,2))))
So the (weight, HP) of the 80,81,82 Corollas were (2265, 75), (2350, 75), (2245,70). The new column on the Corolla '82 row would therefore be 1000*(75+75)/(2265+2350) = 32.50
. So it's more or less the average of HP/Weight
, weighted by Weight
.
In my own dataset there's about 12 columns I want to do this for with a lag of 5+. This means if I continue with the current technique, I'll be doing a lot of typing (lag(,1) + lag(,2) + ... + lag(,5)) for 11 more variables making it hard to update/tweak. Is there a way to perform this sort of lagged weighted calc on multiple variables for differing lags in an easy way?
I have a feeling I want to use zoo::rollaplyr based on a previous question of mine and another question I found (Mutate multiple / consecutive columns (with dplyr or base R)) but I'm having issues determining what the function would be. To keep it to this example, could you help me figure out how to also calculate the average of the last two instances of MPG
, displacement
, and acceleration
by 1000lbs weight
?