I am importing data from a csv file where the date column contains dates recorded in different formats. I wish to parse the column so that it has the class date
and such that all of the dates are formatted in the same style (i.e %d-%m-%Y). I wish to use lubridate
as I have some experience with it and want to get better using it.
I have looked for answers here Parsing dates with different formats and here Parsing dates in multiple formats in R using lubridate but I found the answers incomplete.
Typically when I import csv data I change the col_types
like so:
potatoes <- read_csv("data/potato_prices.csv",
col_types = cols(
DATE = col_date(format = "%Y-%m-%d"),
'M04003DE00BERM372NNBR' = col_double())) %>%
rename("Price" = "M04003DE00BERM372NNBR")
but because my DATE column contains dates in different formats, dates not formatted like "%Y-%m-%d"
return an NA and the class of the column appears as unknown.
I have tried col_guess
, instead of specifying with col_date
with the exact date formats and then mutating the DATE column with the following code, but it has not worked as I would like.
potatoes <- read_csv("data/potato_prices.csv",
col_types = cols(
DATE = col_guess(),
'M04003DE00BERM372NNBR' = col_double()))
potatoes <- potatoes %>%
mutate(DATE = parse_date_time(DATE, orders = c("Ymd", "dmY"))) %>%
rename("Price" = "M04003DE00BERM372NNBR")
Here is an example of how my data appears in excel in csv format
DATE <- c("1879-01-01", "1879-02-01", "1879-03-01", "1879-04-01", "1/05/1990", "1/06/1990", "1/07/1990", "1/08/1990", "1/09/1990", "1/10/1990")
Price <- c("23", "17.9", "17.8", "18", "20", "22", "20", "19", "17.2", "15")
spuds <- data.frame(DATE, Price)
I wish to have a tibble with two columns; DATE as class col_date
and Price as class col_double
. I will then create plots using ggplot
and I think it will be easiest if my DATE column is in class date.
Thanks