I am trying to come up with a way to do a projection within a data frame, preferably using dplyr.
library("dplyr")
set.seed(1)
df0 <- data_frame(t = 0:5,
                  r = c(NA, rnorm(n = 5, mean = 1, sd = 0.1)),
                  P = c(100, rep(x = NA, times = 5)))
df0
# Source: local data frame [6 x 3]
#
# t r P
# (int) (dbl) (dbl)
# 1 0 NA 100
# 2 1 0.9373546 NA
# 3 2 1.0183643 NA
# 4 3 0.9164371 NA
# 5 4 1.1595281 NA
# 6 5 1.0329508 NA
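I am a little stuck as to how to run the projection model recursively. A single mutate call only fills in the first missing value, because lag(P) is evaluated once on the original column: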
df0 %>%
  mutate(P = ifelse(test = is.na(P), yes = lag(P)*r, no = P))
# Source: local data frame [6 x 3]
#
# t r P
# (int) (dbl) (dbl)
# 1 0 NA 100.00000
# 2 1 0.9373546 93.73546
# 3 2 1.0183643 NA
# 4 3 0.9164371 NA
# 5 4 1.1595281 NA
# 6 5 1.0329508 NA
Does anyone know if this is possible?
I have in mind to do this across multiple regions using group_by. The data frame will be quite large, hence the preference for a speedy solution on something other than a data.frame type object.
The only solution I can think of thus far uses a for loop...
for(i in 1:5)
  df0 <- df0 %>% mutate(P = ifelse(is.na(P), yes = lag(P)*r, no = P))
df0
# Source: local data frame [6 x 3]
#
# t r P
# (int) (dbl) (dbl)
# 1 0 NA 100.00000
# 2 1 0.9373546 93.73546
# 3 2 1.0183643 95.45685
# 4 3 0.9164371 87.48020
# 5 4 1.1595281 101.43575
# 6 5 1.0329508 104.77814
... which can lead to memory problems with my big data set and, given all I have read about for loops in R, is probably not the best solution available.
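For this particular recursion there may be a way around the loop altogether. Since each P is the previous P times r, the whole path is just the starting value times the running product of r. Below is a minimal sketch of that idea, assuming P is known only in the first row (as in the example) and treating the missing r in that row as a growth factor of 1. It uses tibble() in place of data_frame(); the name df1 is only for illustration.

```r
library(dplyr)

set.seed(1)
df0 <- tibble(t = 0:5,
              r = c(NA, rnorm(n = 5, mean = 1, sd = 0.1)),
              P = c(100, rep(NA, times = 5)))

# Closed form of P[t] = P[t-1] * r[t]:
# the starting value times the cumulative product of the growth factors.
# coalesce(r, 1) replaces the NA in the first row with 1 (assumption: no change at t = 0).
df1 <- df0 %>%
  mutate(P = first(P) * cumprod(coalesce(r, 1)))

df1
```

With a grouping column (say a hypothetical region column), putting group_by(region) before the mutate should apply the same calculation within each region, since first() and cumprod() are then evaluated per group.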
EDIT
There are some nice answers using purrr to a very similar question, but for simulations. They are written up in a blog post.
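As a rough illustration of how purrr could be applied to the projection in this question (a sketch only, not necessarily what those answers do), accumulate() can carry the previous value forward one step at a time. It assumes the starting value sits in the first row of P; the name df2 and the argument names p_prev and r_t are made up for the example.

```r
library(dplyr)
library(purrr)

set.seed(1)
df0 <- tibble(t = 0:5,
              r = c(NA, rnorm(n = 5, mean = 1, sd = 0.1)),
              P = c(100, rep(NA, times = 5)))

# accumulate() walks along r[2], r[3], ... and feeds each result into the next step.
# .init supplies the starting value, so the output has one more element than r[-1].
df2 <- df0 %>%
  mutate(P = accumulate(r[-1],
                        function(p_prev, r_t) p_prev * r_t,
                        .init = first(P)))

df2
```

Unlike a running product, the step function here can be swapped for any rule that maps the previous value and the current r to the next value, at the cost of an R-level function call per row.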