I am trying to create a new column named "Day_period" in my data frame "df.data". It should take the value "Early Morning" when the value in column "Times" is between "05:00:00" and "08:59:00". I am using the "chron" package:

require(chron)

early.morning.start <- times("05:00:00")
early.morning.stop <- times("08:59:00")

df.data$Day_period[which(df.data$Times >= early.morning.start &&
                         df.data$Times <= early.morning.stop)] <- "Early Morning"

but the code above doesn't seem to do the job.

Cœur
Perukas
  • I forgot to mention that the values in column `df.data$Times` are already `times()` objects. – Perukas Jan 01 '15 at 13:21
  • This isn't reproducible since `df.data` has not been provided. If DF is a small subset of rows that still exhibits the problem then show the result of `dput(DF)` in your question. – G. Grothendieck Jan 01 '15 at 13:50
  • I don't think you want to be using `&&` in a vector subset as it only returns one value, TRUE or FALSE. – Rich Scriven Jan 02 '15 at 07:23
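
The `&&` point raised in the comments can be illustrated with a minimal sketch. The data frame below is a hypothetical stand-in, since the question does not show `df.data`. `&&` is meant for single TRUE/FALSE values (older R versions silently use only the first element, and recent versions raise an error for longer vectors), whereas `&` compares element by element and is what a row subset needs:

```r
library(chron)

# Hypothetical stand-in for df.data (the real data is not shown in the question)
df.data <- data.frame(Times = c("04:30:00", "05:00:00", "07:15:00",
                                "08:59:00", "09:00:00"),
                      stringsAsFactors = FALSE)
df.data$Times <- times(df.data$Times)

early.morning.start <- times("05:00:00")
early.morning.stop  <- times("08:59:00")

# `&` is vectorised: one TRUE/FALSE per row
is.early <- df.data$Times >= early.morning.start &
            df.data$Times <= early.morning.stop

# Start with NA everywhere, then label the matching rows
df.data$Day_period <- NA_character_
df.data$Day_period[is.early] <- "Early Morning"
df.data
```

The `which()` wrapper from the question is not needed here, because a logical vector can be used directly as a row index.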

1 Answer

Here is one way. Given the OP's comments, `cut()` seems to be a good approach here. Since there is no reproducible example, I created a small sample to demonstrate the function. Since you have a large data set, I think you want to update R and use the `data.table` package. If you stick with an old version of R, the `transform()` approach would be your choice.

# Create sample data
mydf <- data.frame(id = 1:7,
                   time = c("01:00:00", "05:30:00", "10:00:00",
                            "14:00:00", "17:00:00", "20:00:00", "23:00:00"),
                   stringsAsFactors = FALSE)

#  id     time
#1  1 01:00:00
#2  2 05:30:00
#3  3 10:00:00
#4  4 14:00:00
#5  5 17:00:00
#6  6 20:00:00
#7  7 23:00:00

library(chron)
library(dplyr)
library(data.table)

# Convert character to times
mydf$time <- times(mydf$time)

# Base R approach
transform(mydf,
          day_period = cut(time,
                       breaks = times(c("00:00:00", "05:00:00", "09:00:00",
                                        "13:00:00", "17:00:00", "21:00:00", "23:59:00")),
                       labels = c("Late night", "Early morning", "Late morning",
                                  "Early afternoon", "Late afternoon", "Evening")))

# dplyr approach
mutate(mydf,
       day_period = cut(time,
                        breaks = times(c("00:00:00", "05:00:00", "09:00:00",
                                         "13:00:00", "17:00:00", "21:00:00", "23:59:00")),
                        labels = c("Late night", "Early morning", "Late morning",
                                   "Early afternoon", "Late afternoon", "Evening")))

# data.table approach
setDT(mydf)[, day_period := cut(time,
                                breaks = times(c("00:00:00", "05:00:00", "09:00:00",
                                                 "13:00:00", "17:00:00", "21:00:00",
                                                 "23:59:00")),
                                labels = c("Late night", "Early morning", "Late morning",
                                           "Early afternoon", "Late afternoon", "Evening"))][]

#   id     time      day_period
#1:  1 01:00:00      Late night
#2:  2 05:30:00   Early morning
#3:  3 10:00:00    Late morning
#4:  4 14:00:00 Early afternoon
#5:  5 17:00:00 Early afternoon
#6:  6 20:00:00  Late afternoon
#7:  7 23:00:00         Evening
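
One thing to watch in the output above: by default `cut()` builds right-closed intervals, so a time exactly on a break (17:00:00 here) falls into the earlier period, and 00:00:00 falls outside the first interval. If the periods should instead start on the break, as in "05:00:00" to "08:59:00" being early morning, a sketch along the following lines might help. It passes `right = FALSE` and, as an assumption of mine rather than part of the code above, uses 1 (a full day in `chron`'s fraction-of-a-day representation) as the last break so that times after 23:59:00 are still covered:

```r
library(chron)

# A few boundary cases to check
x <- times(c("00:00:00", "05:00:00", "08:59:00",
             "09:00:00", "17:00:00", "23:59:30"))

# chron stores times as fractions of a day, so plain numbers are enough;
# the final break of 1 (end of day) is an assumption for this sketch
breaks <- c(as.numeric(times(c("00:00:00", "05:00:00", "09:00:00",
                               "13:00:00", "17:00:00", "21:00:00"))), 1)
labels <- c("Late night", "Early morning", "Late morning",
            "Early afternoon", "Late afternoon", "Evening")

# right = FALSE makes each interval include its start and exclude its end
day_period <- cut(as.numeric(x), breaks = breaks, labels = labels,
                  right = FALSE)
data.frame(time = format(x), day_period)
```

The same `cut()` call can be dropped into the `transform()`, `mutate()`, or `data.table` versions above.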
jazzurro
  • Thanks! Actually, I didn't mention that I have a data frame with measurements for every minute of the day over 4 years, and I want to add a column with values that correspond to day periods (I'm very sorry about that). I.e., if a row has a time between "05:00:00" and "08:59:00", the Day_period column has the value "Early Morning"; if it's between "09:00:00" and "12:59:00", it has the value "Late Morning", etc. – Perukas Jan 01 '15 at 13:59
  • @jazzurro I get some errors with `as_data_frame`: `Error: could not find function "as_data_frame"`. I am using `dplyr_0.3`, but installed the devel version some time back. Did you install it recently? – akrun Jan 01 '15 at 14:45
  • I can't install the dplyr package because I'm using R version 3.0.2 and it is not available. Instead I use the plyr package. I already have a data frame, so I didn't use the as_data_frame function. – Perukas Jan 01 '15 at 14:49
  • @Perukas Thanks for your comment. Next time, would you please add a mini version of your actual data? That way, you can receive the right answer for your case. It will also help SO users provide good suggestions. Since you have a large data set, it would be good to update your R and use the `data.table` package. I will update my suggestion shortly. – jazzurro Jan 02 '15 at 01:47
  • @akrun I am sorry that I did not mention that function is available in the dev version. I think I updated `dplyr` last month. This function is one of the new ones. This is faster than the regular `data.frame()`. – jazzurro Jan 02 '15 at 01:49
  • @jazzurro Maybe it is time for me to reinstall the `dev` version. Thanks for the reply. – akrun Jan 02 '15 at 03:41
  • @akrun If you have not updated your dev version, you may see a couple of new functions including `right_join`, `full_join`, and `bind_rows`, which is a new version of `rbind_all`. My brain is still loosened due to the holiday season. I gotta catch up with you! – jazzurro Jan 02 '15 at 03:46
  • @jazzurro Many thanks for these. Suppose I change the arguments in `left_join(df1, df2)` to `left_join(df2, df1)`; would that be the `right_join`? – akrun Jan 02 '15 at 03:49
  • @akrun Yeah the same outcome. I ran a demo last month. As far as I remember, I think the order of columns is different. – jazzurro Jan 02 '15 at 03:51
  • @jazzuro Thank you, and next time I will be more specific, I'm sorry if I wasted your time. Thanks again. – Perukas Jan 02 '15 at 07:28
  • @Perukas You did not waste my time at all! I am still learning from all questions. If you can be specific, you are likely to get good suggestions. :) – jazzurro Jan 02 '15 at 07:31