0

I have a data.frame with two columns. Both, dates as characters:

a <- c("01-01-2007 00:00:00", "01-02-2007 00:00:00", "03-05-2007 00:00:00", "31-08-2007 00:00:00")
b <- c("01-01-1960 01:25:30", "01-01-1960 1:05:36", "01-01-1960 02:25:59", "01-01-1960 1:20:30")
df <- as.data.frame(cbind(a,b))
df
                    a                   b
1 01-01-2007 00:00:00 01-01-1960 01:25:30
2 01-02-2007 00:00:00  01-01-1960 1:05:36
3 03-05-2007 00:00:00 01-01-1960 02:25:59
4 31-08-2007 00:00:00  01-01-1960 1:20:30

The first column have dates that I need, but time is incorrect. Time is correct in the second column, but dates aren't. The second column also have the problem that, in some rows hours have only one digit.

What I need is a merge between two columns in a time format that I can use to represent counts frequency by time.

I've tried a lot of different combinations to merge both columns but I always get an error. as.Date() don't keep me time, and I can't apply as.POSIXct in the data.frame.

I would appreciate some help.

Thanks

Xbel
  • 735
  • 1
  • 10
  • 30

3 Answers3

2

Using regex to get the correct parts together (just assumes the space in the middle):

df$good_string = paste(gsub(pattern = " .*", "", x = df$a), gsub(pattern = ".* ", "", df$b), sep = " ")
df$parsed_date = as.POSIXct(df$good_string, format = "%d-%m-%Y %H:%M:%S")
df[3:4]
#           good_string         parsed_date
# 1 01-01-2007 01:25:30 2007-01-01 01:25:30
# 2  01-02-2007 1:05:36 2007-02-01 01:05:36
# 3 03-05-2007 02:25:59 2007-05-03 02:25:59
# 4  31-08-2007 1:20:30 2007-08-31 01:20:30
Gregor Thomas
  • 136,190
  • 20
  • 167
  • 294
1

Using base functions, we can do this:

a = as.POSIXct(a, '%d-%m-%Y %H:%M:%S', tz = "GMT")
b = as.POSIXct(b, '%d-%m-%Y %H:%M:%S', tz = "GMT")
df <- data.frame(a,b)
df$merged = paste(strftime(df$a, '%d-%m-%Y', tz = "GMT"), strftime(df$b, '%H:%M:%S', tz = "GMT"))
df

# 
#            a                   b              merged
# 1 2007-01-01 1960-01-01 01:25:30 01-01-2007 01:25:30
# 2 2007-02-01 1960-01-01 01:05:36 01-02-2007 01:05:36
# 3 2007-05-03 1960-01-01 02:25:59 03-05-2007 02:25:59
# 4 2007-08-31 1960-01-01 01:20:30 31-08-2007 01:20:30
dww
  • 30,425
  • 5
  • 68
  • 111
0

Try using the lubridate package:

library(lubridate)

a <- c("01-01-2007 00:00:00", "01-02-2007 00:00:00", "03-05-2007 00:00:00", "31-08-2007 00:00:00")
b <- c("01-01-1960 01:25:30", "01-01-1960 1:05:36", "01-01-1960 02:25:59", "01-01-1960 1:20:30")
df <- as.data.frame(cbind(a,b))
df

hr <- hour(parse_date_time(b, "dmy HMS"))
minu <- minute(parse_date_time(b, "dmy HMS"))
sec<- second(parse_date_time(b, "dmy HMS"))

getDate <- as_date(parse_date_time(a, "dmy HMS"))
getTime <- paste(hr, minu, sec, sep = ":")

as_datetime(paste(getDate, getTime))
conrad-mac
  • 843
  • 10
  • 15