How to unify Date & Time format when there inconsistency exists across a single column

Question

I am working on a data frame that has a column that uses the date and time format as "mm/dd/yy HH:MM" for some observations and "yyyy/mm/dd HH:MM:SS" format for other observations, of course, this inconsistency results in errors or NA returns in my code, how can I unify the whole column so my calculations are not interrupted by this inconsistency?

enter image description here

@TomHoel i am using this line but the results is a column with all NA values mutate(start_date_corrected=as.Date(started_at, format="%m/%d/%Y %H:%M:%S")) — admo7000, Jul 09 '22 at 18:46
Can you share a sample of your data by using `dput(your_data)` ? — shafee, Jul 09 '22 at 18:47
@shafee it is available in the embedded pic click on (enter image description here) — admo7000, Jul 09 '22 at 18:57
Its always a good idea to provide a sample of the data, so that we can help you. Providing a screenshot of the data is not helpful. If you are not being helpful, its a bit difficult for us to help you. : ) — shafee, Jul 09 '22 at 19:10

TarJae · Accepted Answer · 2022-07-09T19:12:28.750

Update step by step:

We have a data frame with for columns each of them have character type columns (you can check this with str(df)
In order to change the format from character to datetime in all four columns we use mutate(across(1:4, ...
What we want is that in each column 1:4 the character type is changed to datetime
this can be done with the function parse_date_time from lubridate package
Here we use ~ to indicate an anonymous function
the . indicates column 1-4.
and most important the argument c("ymd_HMS", "mdy_HM") which gives the order of the different formats of the date columns!

We could use parse_date_time() from lubridate package. Important is the argument c("ymd_HMS", "mdy_HM"). Here you define the occurence of the different formats:

and note to use HMS , because:

hms, hm and ms usage is defunct, please use HMS, HM or MS instead. Deprecated in version '1.5.6'.

library(dplyr)
library(lubridate)

df %>% 
  mutate(across(1:4, ~parse_date_time(., c("ymd_HMS", "mdy_HM"))))

  started_at          ended_at            started_at_1        ended_at_1         
  <dttm>              <dttm>              <dttm>              <dttm>             
1 2021-10-29 17:42:36 2021-10-29 18:00:23 2021-06-13 11:40:00 2021-06-13 12:02:00
2 2021-10-01 15:06:10 2021-10-01 15:09:23 2021-06-27 16:26:00 2021-06-27 16:39:00
3 2021-10-28 23:02:53 2021-10-28 23:07:11 2021-06-10 20:06:00 2021-06-10 20:28:00
4 2021-10-17 00:58:17 2021-10-17 01:02:08 2021-06-11 15:54:00 2021-06-11 16:11:00
5 2021-10-27 18:29:34 2021-10-27 18:34:48 2021-06-05 14:09:00 2021-06-05 14:42:00
6 2021-10-17 13:30:21 2021-10-17 13:35:26 2021-06-05 14:14:00 2021-06-05 14:37:00
7 2021-10-04 19:59:28 2021-10-04 21:06:24 2021-06-16 19:05:00 2021-06-16 19:16:00
8 2021-10-10 00:27:09 2021-10-10 00:39:58 2021-06-23 20:29:00 2021-06-23 20:43:00

data:

structure(list(started_at = c("2021-10-29 17:42:36", "2021-10-01 15:06:10", 
"2021-10-28 23:02:53", "2021-10-17 00:58:17", "2021-10-27 18:29:34", 
"2021-10-17 13:30:21", "2021-10-04 19:59:28", "2021-10-10 00:27:09"
), ended_at = c("2021-10-29 18:00:23", "2021-10-01 15:09:23", 
"2021-10-28 23:07:11", "2021-10-17 01:02:08", "2021-10-27 18:34:48", 
"2021-10-17 13:35:26", "2021-10-04 21:06:24", "2021-10-10 00:39:58"
), started_at_1 = c("6/13/21 11:40", "6/27/21 16:26", "6/10/21 20:06", 
"6/11/21 15:54", "6/5/21  14:09", "6/5/21  14:14", "6/16/21 19:05", 
"6/23/21 20:29"), ended_at_1 = c("6/13/21 12:02", "6/27/21 16:39", 
"6/10/21 20:28", "6/11/21 16:11", "6/5/21  14:42", "6/5/21  14:37", 
"6/16/21 19:16", "6/23/21 20:43")), class = "data.frame", row.names = c(NA, 
-8L))

A bit irrelevant to the question, Can you guide how did you manage to read that data from that image? @TarJae — shafee, Jul 09 '22 at 19:01
@TarJae Thank you for you effort and time to explain further — admo7000, Jul 09 '22 at 22:05
@TarJae This works now and i have successfully unified my format across the column — admo7000, Jul 09 '22 at 22:18

How to unify Date & Time format when there inconsistency exists across a single column

1 Answers1