I have a dataframe with an id, start date, end date and the income and cost values.
table <- data.frame(id = c(1, 2, 3),
start = c("2018-01-01", "2018-02-05", "2018-05-30"),
end = c("2018-01-31", "2018-03-26", "2018-08-31"),
income = c(100, 225, 399),
costs = c(37, 98, 113))
table$start <- as.Date(table$start)
table$end <- as.Date(table$end)
Which results in:
id start end income costs
1 2018-01-01 2018-01-31 100 37
2 2018-02-05 2018-03-26 225 98
3 2018-05-30 2018-08-31 399 113
Just like this question, some of these time periods span n months and I would like to aggregate income and cost by month. For those amounts that relate to a period which spans two, three or more months I would like to linearly apportion them between the two, three or n months.
The problem is that I also would like to keep the id, and perform the operations over two variables (not one like the question asked before), which complicates the whole thing.
What I'm expecting to get is the following table:
id date income costs
1 2018-01 100 37
2 2018-02 108 47.04
2 2018-03 117 50.96
3 2018-05 8.489362 2.404255
3 2018-06 127.340426 36.063830
3 2018-07 131.585106 37.265957
3 2018-08 131.585106 37.265957
I tried using rbindlist over a list of dataframes created by the id, and the following function:
explode <- function(start, end, income) {
dates <- seq(start, end, "day")
n <- length(dates)
rowsum(rep(income, n) / n, format(dates, "%Y-%m"))
}
Map(explode, table$start, table$end, table$income)
But of course it returns only the rowsum values inside and unnamed list.
Any help will be very appreciated. Thanks!