how to convert yearly time data in to specific hourly interval data in R

Question

i have a time series dataset containing 10000 rows with 1 year of data which looks like this

 2012-01-01 06:23:02    c   d10
 2012-01-01 08:12:12    d   d2
 ...........................
 2012-12-31 08:22:24    s   d5

it has 3 fields

  date_time, category1, category2 where category1 contains 4 categorical values (c,v,d,s) category2 contains 10 categorical values(d1....d10).

i want to calculate the individual count of all the categorical values c,v,d,s with respect each categorical values d1......d10. it should be like how many c,v,d,s present for d1,d2....d10 with respect to the time frame 0-1, 1-2, .... 22-23

how to represent the above data in a time series starting from 1-2, 2-3, 3-4,.....23-24

sample output should be like this

               1-2               2-3   3-4   ........23-24

d1 c=2,d=3,v=3s=4

d2 c=3 d=3,v=2,s=2
..................

d10
i have tried using lubridate,data.table packages but couldn't find out the expected solution

@mubarak In the expected output, are those `category c=2, d=3` row.names? — akrun, Nov 10 '14 at 10:50
hai @DavidArenburg again i edited the question, i think now you can identify my problem. — mubarak, Nov 10 '14 at 12:31
Your desired output is a mess, I don't understand it, did @akruns answer help? — David Arenburg, Nov 10 '14 at 12:39
@mubarak As mentioned by David Arenburg, the desired output is still not clear. — akrun, Nov 10 '14 at 13:56
@mubarak It is possible to paste those elements from the `res2` to the format you showed. But, it will be lot messy. — akrun, Nov 10 '14 at 15:00

akrun · Accepted Answer · 2014-11-10T13:59:57.087

Not clear about the expected result. May be this helps:

indx <- with(dat1, as.numeric(format(as.POSIXct(cut(date_time,
                                          breaks='hour')),'%H')))
dat1$indx1 <- interaction(indx, indx+1, sep="-", 
                           lex.order=TRUE, drop=TRUE)
dat1$date_time <- as.character(dat1$date_time)
library(reshape2)
res1 <- dcast(dat1, category1+category2~indx1, value.var='date_time')
res1[,-(1:2)] <- lapply(res1[,-(1:2)], as.POSIXct)

 head(res1,2)
 #  category1 category2  0-1                 1-2  2-3  3-4  4-5  5-6  6-7  7-8
 #1        c1        d1 <NA> 2012-01-03 01:43:02 <NA> <NA> <NA> <NA> <NA> <NA>
 #2        c1       d10 <NA>                <NA> <NA> <NA> <NA> <NA> <NA> <NA>
 #   8-9                9-10 10-11               11-12 12-13 13-14 14-15 15-16
 #1 <NA> 2012-01-01 09:13:02  <NA>                <NA>  <NA>  <NA>  <NA>  <NA>
 #2 <NA> 2012-01-02 09:43:02  <NA> 2012-01-02 11:03:02  <NA>  <NA>  <NA>  <NA>
 #  16-17 17-18 18-19 19-20 20-21 21-22 22-23 23-24
 #1  <NA>  <NA>  <NA>  <NA>  <NA>  <NA>  <NA>  <NA>
 #2  <NA>  <NA>  <NA>  <NA>  <NA>  <NA>  <NA>  <NA>

Update

If you want the counts

res2 <- dcast(dat1, category1+category2~indx1, value.var='date_time', length)
res2[1:3,1:3]
#   category1 category2 0-1
#1        c1        d1   0
#2        c1       d10   0
#3        c1       d11   0

data

set.seed(24)
dat1 <- data.frame(date_time=seq(as.POSIXct('2012-01-01 06:23:02', 
      '%Y-%m-%d %H:%M:%S'), length.out=300, by='10 min'), category1 =
 sample(paste0('c',1:20), 300, replace=TRUE), category2 = 
sample(paste0('d', 1:20), 300, replace=TRUE))

thank u @akrun for the suggestion. but thats not the output i expected. i have modified the question for better understanding....pls help me — mubarak, Nov 10 '14 at 12:53

how to convert yearly time data in to specific hourly interval data in R

1 Answers1

Update

data