Context

This is a follow-up to "Exclude specific time periods in R".

str(databank[[1]])
'data.frame':   987344 obs. of  13 variables:
 $ Date      : Factor w/ 43 levels "01/03/2017","02/03/2017",..: 17 17 17 17 17 17 17 17 17 17 ...
 $ Time      : Factor w/ 23400 levels "01:00:00 PM",..: 15344 15343 15342 15341 15340 15339 15338 15337 15336 15335 ...
 $ Bar.      : Factor w/ 63033 levels "","1/63032","10/63032",..: 58929 1 1 1 1 1 1 1 58928 1 ...
 $ Bar.Index : int  0 NA NA NA NA NA NA NA -1 NA ...
 $ Tick.Range: int  5 NA NA NA NA NA NA NA 0 NA ...
 $ Open      : num  16.9 NA NA NA NA ...
 $ High      : num  16.9 NA NA NA NA ...
 $ Low       : num  16.9 NA NA NA NA ...
 $ Close     : num  16.9 NA NA NA NA ...
 $ Vol       : num  900 0 0 0 0 0 0 0 100 0 ...
 $ MACDHist  : num  -137 NA NA NA NA ...
 $ MACD      : num  -225 NA NA NA NA ...
 $ MACDSig   : num  -87.9 NA NA NA NA ...

head(databank[[1]])
        Date        Time        Bar. Bar.Index Tick.Range  Open  High  Low Close
1 12/04/2017 10:45:43 AM 63032/63032         0          5 16.95 16.95 16.9 16.95
2 12/04/2017 10:45:42 AM                    NA         NA    NA    NA   NA    NA
3 12/04/2017 10:45:41 AM                    NA         NA    NA    NA   NA    NA
4 12/04/2017 10:45:40 AM                    NA         NA    NA    NA   NA    NA
5 12/04/2017 10:45:39 AM                    NA         NA    NA    NA   NA    NA
6 12/04/2017 10:45:38 AM                    NA         NA    NA    NA   NA    NA
  Vol MACDHist    MACD MACDSig
1 900  -136.77 -224.68  -87.91
2   0       NA      NA      NA
3   0       NA      NA      NA
4   0       NA      NA      NA
5   0       NA      NA      NA
6   0       NA      NA      NA

Problem

I attempted to implement the top answer's lubridate method using:

test1 <- databank[[1]][hour(d) == 9 & minute(d) > 30,] 

But it only returns times from 9:31:00 to 9:59:59, whereas I want times from 9:35:00 to 15:55:00...

Things I tried

test1 <- databank[[1]][hour(d) == 9 & minute(d) > 30, hour(d) == 15 & minute(d) < 55]

and

test1 <- databank[[1]][hour(d) == 9 & minute(d) > 30 & hour(d) == 15 & minute(d) < 55, ] 

The former returns an empty table with ~79,000 blank rows (only the entry number) and no headers; the latter returns an empty table with just the headers. I thought the issue was that my dates and times are not in POSIX format, but I ran into trouble converting them...

What am I missing?

Dave2e
Robert Tan
  • The `lubridate` solution in http://stackoverflow.com/a/12891857/3817004 was wrong as it only returned time stamps between 7:30pm and 7:59pm. – Uwe Apr 18 '17 at 08:34
  • Good to know, judging by the checkmark and lack of dissenting opinion I deemed it as canon, wrongly so – Robert Tan Apr 18 '17 at 20:59
  • Good idea to show the result of `str(databank[[1]])`. This exhibits that `Date` and `Time` are factors. But, how did you create `d` from the `Date` and `Time` columns of `databank[[1]]`? – Uwe Apr 19 '17 at 07:15
  • @UweBlock Thanks, I adopted it as standard practice. It was a while ago but I believe I did `d <- hms(databank[[1]]$Time)` or something close – Robert Tan Apr 19 '17 at 20:08

3 Answers


After encountering the | operator in other SO answers, I implemented it and got this:

test1 <- databank[[1]][(hour(d) == 9 & minute(d) > 34) | (hour(d) == 10 & minute(d) > 0) | (hour(d) == 11 & minute(d) > 0) | (hour(d) == 12 & minute(d) > 0) | (hour(d) == 01 & minute(d) > 0) | (hour(d) == 02 & minute(d) > 0) | (hour(d) == 03 & minute(d) <= 54), ]

An ugly solution given limited knowledge, but it works.

Per Uwe Block's suggestion:

databank[[1]][(hour(d) == 9 & minute(d) >= 35) | hour(d) %in% 10:14 | (hour(d) == 15 & minute(d) < 55), ]

I more than welcome seeing a much more elegant solution!
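
One possibly tidier variant is to collapse hour and minute into minutes since midnight and compare against a single range. It also shows why the `&`-only attempt in the question returns no rows: no timestamp can have an hour of both 9 and 15. The sketch below uses base R only; `time_chr` is a hypothetical stand-in for the `Time` column, and parsing `%p` assumes an English-style AM/PM locale.

```r
# Hypothetical sample of 12-hour clock strings, similar to the Time column
time_chr <- c("09:34:59 AM", "09:35:00 AM", "12:00:00 PM",
              "03:54:59 PM", "03:55:00 PM")

# Parse as times of day; the date part defaults to today and is ignored
lt <- as.POSIXlt(time_chr, format = "%I:%M:%S %p", tz = "GMT")

# Minutes since midnight on a 24-hour clock
mins <- lt$hour * 60 + lt$min

# Keep 09:35:00 (inclusive) up to 15:55:00 (exclusive)
keep <- mins >= 9 * 60 + 35 & mins < 15 * 60 + 55
time_chr[keep]
```

With real data, `keep` would be used as the row index, e.g. `databank[[1]][keep, ]`.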

Robert Tan
  • If you want to select times between 9:35:00 and 15:55:00 (note the 24 hour clock!) for each day you need to modify your solution to get correct results. Please try `databank[[1]][(hour(d) == 9 & minute(d) >= 35) | hour(d) %in% 10:14 | (hour(d) == 15 & minute(d) < 55), ]`. As it is currently written, your selection stops at 12:59 and picks the early morning hours from 01:00 to 03:54. – Uwe Apr 18 '17 at 08:54
  • @UweBlock thanks for the colon trick. At first, I was afraid that using 01, 02 ... would return morning hours, however, per testing, it returned correctly. It could be due to the fact that the dates are factors. – Robert Tan Apr 18 '17 at 20:58
  • Yes, this works just by luck because of the factors and because your data don't cover the whole day. The first factor level in your data is `"01:00:00 PM"`. If `"01:00:00 AM"` were included in your data, it would be the first factor level and `"01:00:00 PM"` the second, followed by `"01:00:01 AM"` and `"01:00:01 PM"`, etc., alternating between AM and PM. – Uwe Apr 19 '17 at 07:32
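
The level ordering described in the last comment can be checked with a small example. The four strings below are made up for illustration; `factor()` sorts its levels alphabetically, so AM and PM versions of the same clock reading end up next to each other rather than in chronological order.

```r
# Hypothetical 12-hour time strings, deliberately given out of order
x <- c("01:00:01 PM", "01:00:00 PM", "01:00:01 AM", "01:00:00 AM")

# Levels are sorted as text, not as times of day
lv <- levels(factor(x))
lv

# The integer codes follow that text order, so comparing them
# does not compare times chronologically
as.integer(factor(x))
```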

Your question is not very clear on what your starting conditions are. To work with just time (without an associated date) the chron package is handy.

# create a sample time sequence
h <- rep(1:22, each = 2)
m <- 1:44
randomtimes <- paste(h, m, "00", sep = ":")

library(chron)
# convert the time strings into times objects
samplet <- times(randomtimes)

# perform comparison and subset
samplet[samplet > times("9:30:00") & samplet < times("15:55:00")]
Dave2e

The data sample databank[[1]] given in the actual question (here) is different from the situation in the referenced question Exclude specific time periods in R (there):

  1. The timestamp there had already been converted to class POSIXct while the Date and Time here are in separate factor columns.
  2. Here, Time uses a 12 hour clock with AM/PM indicator.

It might be possible to work with the factor levels of Time but this is unreliable. So, the safest way IMHO is to create a POSIXct timestamp from the Date and Time columns and to select by time of day (without date) later on.

Add time stamp

databank[[1L]]$datetime <- 
  with(databank[[1L]], as.POSIXct(paste(Date, Time), format = "%d/%m/%Y %I:%M:%S %p", tz = "GMT"))
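
As a quick check of the format string, here is a minimal sketch on a single made-up row. `date_f` and `time_f` are hypothetical factors standing in for the `Date` and `Time` columns; `paste()` uses the factor labels, `%I` reads the 12-hour clock value, and `%p` reads the AM/PM marker. Parsing `%p` is locale-dependent, so an English-style locale is assumed.

```r
# Hypothetical single-row stand-ins for the factor columns Date and Time
date_f <- factor("12/04/2017")
time_f <- factor("10:45:00 PM")

# paste() works on the factor labels, not on the integer codes
ts <- as.POSIXct(paste(date_f, time_f),
                 format = "%d/%m/%Y %I:%M:%S %p", tz = "GMT")

# Show the result on a 24-hour clock
format(ts, "%Y-%m-%d %H:%M:%S")
```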

Add time of day

For convenience, a time_of_day (without date) column is added as character:

databank[[1L]]$time_of_day <- 
  with(databank[[1L]], format(datetime, "%T"))

databank[[1L]][, c("Date", "Time", "datetime", "time_of_day")]
#         Date        Time            datetime time_of_day
#1: 12/04/2017 10:45:43 AM 2017-04-12 10:45:43    10:45:43
#2: 12/04/2017 10:45:42 AM 2017-04-12 10:45:42    10:45:42
#3: 12/04/2017 10:45:41 AM 2017-04-12 10:45:41    10:45:41
#4: 12/04/2017 10:45:40 AM 2017-04-12 10:45:40    10:45:40
#5: 12/04/2017 10:45:39 AM 2017-04-12 10:45:39    10:45:39
#6: 12/04/2017 10:45:38 AM 2017-04-12 10:45:38    10:45:38
#7: 12/04/2017 10:45:00 PM 2017-04-12 22:45:00    22:45:00

Note that I've added a PM time for illustration.

Select rows by time of day range

databank[[1L]][time_of_day >= "09:35:00" & time_of_day < "15:55:00", ]
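
Comparing `time_of_day` as text works because zero-padded `HH:MM:SS` strings on a 24-hour clock sort in the same order as the times they represent. Referring to `time_of_day` without a prefix inside `[ ]` is data.table behaviour (the `1:` row labels in the printed output above suggest a data.table); on a plain data.frame the column has to be referenced explicitly. A minimal sketch with a hypothetical data frame `df`:

```r
# Hypothetical data frame with a zero-padded 24-hour time_of_day column
df <- data.frame(
  time_of_day = c("09:34:59", "09:35:00", "10:45:43",
                  "15:54:59", "15:55:00", "22:45:00"),
  Vol = c(100, 900, 0, 200, 300, 0),
  stringsAsFactors = FALSE
)

# Build the row filter with explicit column references
in_range <- df$time_of_day >= "09:35:00" & df$time_of_day < "15:55:00"

# Keep rows from 09:35:00 (inclusive) to 15:55:00 (exclusive)
selected <- df[in_range, ]
selected$time_of_day
```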
Uwe