1

I am trying to use the apply.daily/weekly/monthly functions with xts in R, but I need to have the apply function work on subsets at a time. For example,

x=xts(data.frame(value=1:100,code=rep(1:5,20)),
                 seq(as.Date('2011-01-01'),by=1,length.out=100)) 

Step 1: I'd like to roll-up by week AND "code", so I'd have something like

ROW 1: Week = Week 1, Code = 1, Sum(all entries that have a code of 1 that fall in week 1)
ROW 2: Week = Week 1, Code = 2, Sum(all entries that have a code of 2 that fall in week 1) ... ROW 70: Week = Week 10, Code = 1, Sum(all entries that have a code of 1 that fall in week 10)

Step 2: I'd like the same number of rows for each week, because ultimately I want a matrix--one row per code and one column per week. I'd prefer not to create separate week variables as the first answer suggests because I'm going to need to cut this again by month, day, hour, minute, and maybe even custom time duration. I'm happy to bypass the first step, because that's only an intermediate output.

Unfortunately in my real data I can't subset manually because I have 10,000+ "codes" and 53M rows.

Karthik
  • 31
  • 3
  • With 10,000 codes, do you expect 10,000 output rows for each code for each week, i.e. even if some of those codes were not found in a given week? Or do you just want output rows for when there was at least one occurrence of a code. (So you might get 15 rows for week 1, 20 rows for week 2, 12 rows for week 3, etc.) – Darren Cook Jun 30 '15 at 08:23
  • I'd like the same number of rows for each week, because ultimately I want a matrix--one row per code and one column per week. I'd prefer not to create separate week variables as the first answer suggests because I'm going to need to cut this again by month, day, hour, minute, and maybe even custom time durations. – Karthik Jul 01 '15 at 13:39
  • I'd change the question to describe the actual format you want. I suspect a matrix output might be easier; it will certainly be a different approach. – Darren Cook Jul 01 '15 at 13:59

1 Answers1

0
If your data is

x <- data.frame(value=1:100,code=rep(1:5,20), date = seq(as.Date('2011-01-01'),by=1,length.out=100))

I'd try

library(dplyr)

# make a new column that includes the week of the date
x$date <- as.POSIXct(x$date)
x$week <- as.character(strftime(x$date,format="%W"))

result <- x %>% group_by(week,code) %>% summarize(sum = sum(value))


#    week code sum
# 1    00    1   1
# 2    00    2   2
# 3    01    1   6
# 4    01    2   7
# 5    01    3  11
# 6    01    4  13
# 7    01    5   5
# 8    02    1  27
# 9    02    2  12
# 10   02    3  13
# ..  ...  ... ...
rmuc8
  • 2,869
  • 7
  • 27
  • 36