I've written a function that I'm trying to apply to a dataset using pmap. The function amends two columns in the dataset, and I want those amendments to carry over to the second and subsequent iterations of pmap.
Reproducible example below:
library(tidyr)
library(dplyr)
library(purrr) #needed for pmap_dfr
set.seed(1982)
#create example dataset
dataset <- tibble(groupvar = sample(c(1:3), 20, replace = TRUE),
a = sample(c(1:10), 20, replace = TRUE),
b = sample(c(1:10), 20, replace = TRUE),
c = sample(c(1:10), 20, replace = TRUE),
d = sample(c(1:10), 20, replace = TRUE)) %>%
arrange(groupvar)
#function to sum 2 columns (col1 and col2), then adjust those columns such that the cumulative sum of the two columns
#within the group doesn't exceed the specified limit
shared_limits <- function(col1, col2, group, limit){
dataset <- dataset
dataset$group <- dataset[[group]]
dataset$newcol <- dataset[[col1]] + dataset[[col2]]
#group by the column named in the `group` argument (copied into dataset$group above) rather than a hard-coded name
dataset <- dataset %>% group_by(group) %>% mutate(cumulative_sum=cumsum(newcol))
dataset$limited_cumulative_sum <- ifelse(dataset$cumulative_sum>limit, limit, dataset$cumulative_sum)
dataset <- dataset %>% group_by(group) %>% mutate(limited_cumulative_sum_lag=lag(limited_cumulative_sum))
dataset$limited_cumulative_sum_lag <- ifelse(is.na(dataset$limited_cumulative_sum_lag),0,dataset$limited_cumulative_sum_lag)
dataset$adjusted_sum <- dataset$limited_cumulative_sum - dataset$limited_cumulative_sum_lag
dataset[[col1]] <- ifelse(dataset$adjusted_sum==dataset$newcol, dataset[[col1]],
pmin(dataset[[col1]], dataset$adjusted_sum))
dataset[[col2]] <- dataset$adjusted_sum - dataset[[col1]]
dataset <- dataset %>% ungroup() %>% dplyr::select(-group, -newcol, -cumulative_sum, -limited_cumulative_sum, -limited_cumulative_sum_lag, -adjusted_sum)
dataset
}
#apply function directly
new_dataset <- shared_limits("a", "b", "groupvar", 25)
#apply function using a separate parameters table and pmap_dfr
shared_limits_table <- tibble(col1 = c("a","b"),
col2 = c("c","d"),
group = "groupvar",
limit = c(25, 30))
dataset <- pmap_dfr(shared_limits_table, shared_limits)
In the example above, pmap_dfr first applies the shared limit to columns "a" and "c" and returns an adjusted dataset. It then applies the shared limit to columns "b" and "d" and returns a second adjusted dataset, and the two results are row-bound together. However, because each iteration starts again from the original dataset, the adjustments made to "a" and "c" are lost in the second result.
Is there any way of storing the adjustments that are made to each column as we progress through each iteration of pmap?
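To make the desired behaviour concrete, here is a minimal sketch of the kind of result I'm after, written with purrr::reduce instead of pmap so that each step receives the output of the previous one. This is only an illustration under assumptions: shared_limits_on is a hypothetical variant of my function that takes the data as its first argument, toy and params are made-up stand-ins for my dataset and parameter table, and the simplified pmin step assumes all values are non-negative.

```r
library(tibble)
library(purrr)

# Hypothetical variant of shared_limits(): takes the data as an argument
# instead of reading the global `dataset`. Assumes non-negative values.
shared_limits_on <- function(data, col1, col2, group, limit) {
  total <- data[[col1]] + data[[col2]]
  # cumulative sum within each group, capped at the limit
  capped <- pmin(ave(total, data[[group]], FUN = cumsum), limit)
  # previous capped value within each group (0 for the first row of a group)
  prev <- ave(capped, data[[group]], FUN = function(x) c(0, head(x, -1)))
  adjusted <- capped - prev
  data[[col1]] <- pmin(data[[col1]], adjusted)
  data[[col2]] <- adjusted - data[[col1]]
  data
}

# Made-up toy data and parameter table (not the dataset from the question)
toy <- tibble(groupvar = c(1, 1, 2),
              a = c(10, 10, 5), b = c(3, 3, 3),
              c = c(10, 10, 5), d = c(20, 20, 4))
params <- tibble(col1 = c("a", "b"), col2 = c("c", "d"),
                 group = "groupvar", limit = c(25, 30))

# reduce() threads the adjusted data through each row of the parameter table
result <- reduce(
  seq_len(nrow(params)),
  function(data, i) shared_limits_on(data, params$col1[i], params$col2[i],
                                     params$group[i], params$limit[i]),
  .init = toy
)
```

In this sketch the second step (columns "b" and "d") operates on the data already adjusted for "a" and "c", so both sets of adjustments are present in result.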