RStudio: Separate YYYY-MM-DD into Individual Columns

Question

I am fairly new to R and I am pulling my hair out trying to do what is probably something super simple.

I downloaded the crime data for Los Angeles from 2010 - 2019. There are 2,114,010 rows of data. Right now, it is called 'df' in my Global Environment area.

I want to manipulate one specific column titled "occurred" - which is a date reference to when the crime occurred.

Right now, it is set up as YYYY-MM-DD (e.g., 2010-02-20).

I am trying to separate all three into individual columns. I have Googled, and Googled, and Googled and tried and tried and tried things from this forum and StackExchange and just cannot get it to work.

I have tried lubridate and followed the instructions in other answers, but it simply won't create new columns (one each for Year, Month, Day).

Here is a bit of the reprex from the dataset ... I did not include all of the different variables, because they aren't the issue.

As mentioned, I am trying to separate 'occurred' into individual Year, Month, and Day columns.

> head(df, 10)[c('dr_no','occurred','time','area_name')]
       dr_no   occurred time area_name
1    1307355 2010-02-20 1350    Newton
2   11401303 2010-09-12   45   Pacific
3   70309629 2010-08-09 1515    Newton
4   90631215 2010-01-05  150 Hollywood
5  100100501 2010-01-02 2100   Central
6  100100506 2010-01-04 1650   Central
7  100100508 2010-01-07 2005   Central
8  100100509 2010-01-08 2100   Central
9  100100510 2010-01-09  230   Central
10 100100511 2010-01-06 2100   Central

`transform(df, year = format(occurred, "%Y"), month = format(occurred, "%m"), day = format(occurred, "%d"))` — Ronak Shah, Apr 25 '20 at 12:26
I can see the process working in the console as it runs through all of the data. It is adding year, month, and day. But, when I view the data, the new columns are not there. The number of variables in the data also remained the same (28). — fiverings84, Apr 25 '20 at 13:20
You need to assign the data back to an object. `df1 <- transform(df, year = format(.......` and then check `df1` — Ronak Shah, Apr 25 '20 at 13:29
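The point made in the comments above (the result has to be assigned to an object) can be illustrated with a minimal base R sketch. The small data frame here is a made-up stand-in for the real data, and it assumes `occurred` is already of class Date; if it were stored as character, `format()` would just return the strings unchanged.

```r
# Toy stand-in for the real data (hypothetical rows; only `occurred` matters).
df <- data.frame(
  dr_no    = c(1307355, 11401303, 70309629),
  occurred = as.Date(c("2010-02-20", "2010-09-12", "2010-08-09"))
)

# transform() returns a modified copy. Without assignment the result is
# printed to the console and then discarded, so `df` itself is unchanged.
transform(df, year = format(occurred, "%Y"))
ncol(df)  # still 2 columns

# Assigning the result keeps the new columns.
df1 <- transform(
  df,
  year  = format(occurred, "%Y"),
  month = format(occurred, "%m"),
  day   = format(occurred, "%d")
)
names(df1)
```

Note that `format()` produces character columns here; wrap the calls in `as.integer()` if numbers are needed.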

score 2 · Accepted Answer · answered Apr 25 '20 at 19:54

We can do this with dplyr (part of the tidyverse) and lubridate:

library(dplyr)
library(lubridate)
df <- df %>%
  mutate(occurred = as.Date(occurred),  # make sure the column is a Date
         year  = year(occurred),
         month = month(occurred),
         day   = day(occurred))
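For anyone who wants to try this without the full dataset, here is a self-contained sketch of the same idea on a tiny, made-up data frame. The rows and the name `toy` are assumptions for illustration only, and the dates are deliberately stored as character to show what `as.Date()` is doing.

```r
library(dplyr)
library(lubridate)

# Hypothetical three-row sample with dates stored as character strings.
toy <- data.frame(
  dr_no    = c(1307355, 11401303, 70309629),
  occurred = c("2010-02-20", "2010-09-12", "2010-08-09"),
  stringsAsFactors = FALSE
)

# Convert to Date first, then pull out each component.
# The result must be assigned, otherwise the new columns are lost.
toy <- toy %>%
  mutate(occurred = as.Date(occurred),
         year  = year(occurred),
         month = month(occurred),
         day   = day(occurred))

str(toy)
```

Unlike `format()`, the lubridate accessors return numeric values, which is usually more convenient for filtering or grouping by year or month.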


akrun
