I have 3848 POSIXct values: the stop times of bike trips in April 2015. As the output below shows, every value is of class POSIXct and falls within April.
length(output2_stoptime)
[1] 3848
head(output2_stoptime)
[1] "2015-04-01 17:19:27 EST" "2015-04-02 07:26:06 EST" "2015-04-08 10:09:37 EST"
[4] "2015-04-12 20:08:00 EST" "2015-04-13 17:53:11 EST" "2015-04-14 07:17:34 EST"
class(output2_stoptime)
[1] "POSIXct" "POSIXt"
range(output2_stoptime)
[1] "2015-04-01 00:34:29 EST" "2015-04-30 20:49:22 EST"
Sys.timezone()
[1] "EST"
However, when I convert these into a table of stop counts per day, four of them end up dated May 1. I thought this might be caused by my system time zone, since I am currently in Europe, but the problem persists even after setting the time zone to EST. For example:
by_day_output2 = as.data.frame(as.Date(output2_stoptime), tz = "EST")
colnames(by_day_output2)[1] = "SUM"
movements_Apr = as.data.frame(table(by_day_output2$SUM))
colnames(movements_Apr)[1] = "DATE"
tail(movements_Apr)
DATE Freq
26 2015-04-26 96
27 2015-04-27 125
28 2015-04-28 145
29 2015-04-29 151
30 2015-04-30 99
31 2015-05-01 4
Why are these four timestamps assigned to the wrong date when the time zone of the data matches the system time zone? None of the data falls in May.
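A minimal sketch that may help narrow this down, using only the latest stop time from the `range()` output above. It assumes, as far as I can tell from the documentation, that the POSIXct method of `as.Date()` defaults to `tz = "UTC"`, and that in the code above `tz = "EST"` is being passed to `as.data.frame()` rather than to `as.Date()`. The names `x`, `d_default`, and `d_est` are made up for illustration.

```r
# Latest stop time from range() above, built explicitly in EST (UTC-5).
x <- as.POSIXct("2015-04-30 20:49:22", tz = "EST")

# Same call shape as as.Date(output2_stoptime): no tz reaches as.Date(),
# so the date is taken from the UTC clock time (assumed default).
d_default <- as.Date(x)

# Here tz is an argument of as.Date() itself, so the date is taken in EST.
d_est <- as.Date(x, tz = "EST")

print(d_default)
print(d_est)

# Hypothetical check on the full vector: how many values change date?
# sum(as.Date(output2_stoptime) != as.Date(output2_stoptime, tz = "EST"))
```

If the two printed dates differ, the stop times after 19:00 EST on April 30 would be the ones landing on May 1, which would be consistent with the four stray rows.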