1
          DATETIME      MAX.TEMP MIN.TEMP NEW.PRCP NEW.SNOW
18791 1999/03/20 06:00      0.4     -2.0        0      0.0
18911 1999/03/20 18:00     11.7     -1.5        0      0.0
19031 1999/03/21 06:00      6.9      2.0       -1      0.1
19151 1999/03/21 18:00     12.3      2.4       -1      0.0
19271 1999/03/22 06:00      2.6     -1.1        1      0.0
19391 1999/03/22 18:00      5.4     -2.3        0      0.0
19511 1999/03/23 06:00      1.1     -3.9        1      0.1
19631 1999/03/23 18:00      7.1     -3.1        0      0.1
19751 1999/03/24 06:00      1.6     -2.6        0      0.1
19871 1999/03/24 18:00      7.5     -0.8        1      0.1

I would like to get the daily mean for columns: MAX.TEMP, MIN.TEMP Also I would like to get the daily sum for columns: NEW.PRCP, NEW.SNOW

I have been trying to use data.table similar to an earlier question here

I started with trying to mimic the comment provided and began trying to just get the mean for MAX.TEMP:

setDT(new_Tbl)[, .(DATETIME = DATETIME[1L], MAX.TEMP = mean(MAX.TEMP)), 
           by = .(indx = substr(DATETIME,12,13) == '06'))]

Which returns errors for unexpected ] and ) and could not find function"."

I am assuming there are shortcuts in the syntax that I do not follow. I appreciate the help and patience.

Community
  • 1
  • 1
Bret Shandro
  • 61
  • 1
  • 9
  • 1
    So what issues did you encounter in your attempt? – nrussell Oct 13 '15 at 17:49
  • Showing your attempt in the original question makes it clearer where you need help. Just fyi for future questions. You already have (what looks to me like) a good answer to this question, below. – Frank Oct 13 '15 at 18:20
  • Thanks for adding it. General advice with data.table. If you want to see what a column looks like, just put it on its own in the `j` part of `DT[i,j,by]`. So, for example, you can look at `DT[,substr(DATETIME,12,13)]` and `DT[, substr(DATETIME,12,13) == '06']` – Frank Oct 13 '15 at 18:35
  • 1
    @Frank, I will keep in mind for the future. I added an attempt and before I finished the edit the the answer was below. Extremely helpful. Thanks – Bret Shandro Oct 13 '15 at 18:41

1 Answers1

1
library(data.table)
setDT(df)[, `:=`(MAX.TEMP = mean(MAX.TEMP), MIN.TEMP = mean(MIN.TEMP), 
                 NEW.PRCP = sum(NEW.PRCP), NEW.SNOW = sum(NEW.SNOW)), as.IDate(DATETIME)]
# 
#             DATETIME MAX.TEMP MIN.TEMP NEW.PRCP NEW.SNOW
#  1: 1999/03/20 06:00     6.05    -1.75        0      0.0
#  2: 1999/03/20 18:00     6.05    -1.75        0      0.0
#  3: 1999/03/21 06:00     9.60     2.20       -2      0.1
#  4: 1999/03/21 18:00     9.60     2.20       -2      0.1
#  5: 1999/03/22 06:00     4.00    -1.70        1      0.0
#  6: 1999/03/22 18:00     4.00    -1.70        1      0.0
#  7: 1999/03/23 06:00     4.10    -3.50        1      0.2
#  8: 1999/03/23 18:00     4.10    -3.50        1      0.2
#  9: 1999/03/24 06:00     4.55    -1.70        1      0.2
# 10: 1999/03/24 18:00     4.55    -1.70        1      0.2

Each column is created within the `:=`(...) operator. Also it is all grouped by day without having to explicitly create a new column as.IDate(DATETIME).

Pierre L
  • 28,203
  • 6
  • 47
  • 69