0

Due to errors in my database/export, I have an inconsistant timestamp - sometimes with seconds, sometimes without - how can I easily unify it?

e.g. read_csv with X1 = col_datetime(format = "%d.%m.%Y %H:%M:%S") is not working properly.

My data look like follows:

X1, X2
24.02.2012 22:00, '121'
24.02.2012 22:15:00, '122'
24.02.2012 22:30:00, '124'
24.02.2012 22:45:00, '122'
24.02.2012 23:00, '121'

Thanks!

DaveArmstrong
  • 18,377
  • 2
  • 13
  • 25
Herr Student
  • 853
  • 14
  • 26

1 Answers1

1

anytime can handle it with an additional format entry (as d.m.Y can be confused with m.d.Y we do not add it by default, by entering it you make a more conscious choice).

> times <- c("24.02.2012 22:00", "24.02.2012 22:15:00", "24.02.2012 22:30:00", 
+            "24.02.2012 22:45:00", "24.02.2012 23:00") 
> library(anytime)  
> addFormats("%d.%m.%Y %H:%M:%S")       
> anytime(times)      
[1] "2012-02-24 22:00:00 CST" "2012-02-24 22:15:00 CST" "2012-02-24 22:30:00 CST"
+   "2012-02-24 22:45:00 CST" "2012-02-24 23:00:00 CST"     
>  

So you could overwrite your column, or add a new ones, via

> library(anytime)  
> addFormats("%d.%m.%Y %H:%M:%S")       
> X1$parsed <- anytime(X1$raw_dates)

(assuming you have the data in X in that column) after adding the format as shown. Formats are added to the vector of known formats only for the current session.

Dirk Eddelbuettel
  • 360,940
  • 56
  • 644
  • 725