I am having problems merging / joining data for the coming daylight savings shift. My time-vector d
is supposed to be the controlling time-vector, so when I join with data with missing holes I just get NA values. This normally works brillantly. However, during the coming '2015-10-25 02:00:00'
it goes horribly wrong.
Data example:
d <- seq.POSIXt(from = as.POSIXct("2015-10-25 00:00:00", "%Y-%m-%d %H:%M:%S", tz = ""),
to = as.POSIXct("2015-10-25 23:00:00", "%Y-%m-%d %H:%M:%S", tz = ""), by = "hour")
df1 <- data.frame(Date = d, value1 = 1:25)
df2 <- data.frame(Date = as.POSIXct(format(d, "%Y-%m-%d %H:%M:%S"), tz = ""), value2 = 26:50)
require(dplyr)
df <- left_join(df1, df2, by = "Date")
df <- merge(df1, df2, by = "Date", all.x = TRUE)
Both left_join
and merge
gives wrong results, and I am not sure what goes wrong. Well, I can see R has no idea how to handle the two repeated hours - and that is completely understandable. Both time series are POSIXct
, but there is clearly some information I am missing? How can you handle this? I would prefer a base R-solution.
It gets exponentially worse, if you need to do even more joins from different data-sets. I need to join 7 and it just gets worse and worse.
The correct result is:
result <- data.frame(Date = d, var1 = df1[, 2], var2 = df2[, 2])