
I am a beginner in R and I'm currently working on a dataset that contains the concentrations of different particles, such as CO2 and NO2, at different times (dates and hours).

How can I add a column that matches each date with the correct season?

Here is the dataframe I'm working on:

[image: screenshot of the dataframe]

And here is the code I wrote in RStudio to try to solve my problem:

[image: screenshot of the code]

Phil
Raphael42
  • Welcome to Stack Overflow! Please do not use images to post code or error messages unless the image conveys something more than the code or error message. If you need to, you should augment the image with the text it contains, because images of text are not searchable or accessible, and they make it harder for people trying to help you. Please also read [mcve]. – help-info.de Oct 01 '22 at 09:31
  • Please read why you should not upload images of code and/or errors on SO when asking a question: [Discourage screenshots of code and/or errors](https://meta.stackoverflow.com/questions/303812/discourage-screenshots-of-code-and-or-errors). – help-info.de Oct 01 '22 at 09:32
  • @Raphael42 Take a look at [this answer](https://stackoverflow.com/a/67358497/3460670) or [this post](https://stackoverflow.com/questions/36502140/determine-season-from-date-using-lubridate-in-r) - they will likely do what you need. – Ben Oct 02 '22 at 16:23

1 Answer

library(tidyverse)
library(lubridate)

Consider this sample data:

# A tibble: 36 x 2
   date         co2
   <date>     <int>
 1 2022-01-01    62
 2 2022-01-02    59
 3 2022-01-03    55
 4 2022-02-01    90
 5 2022-02-02    66
 6 2022-02-03    74
 7 2022-03-01   104
 8 2022-03-02   103
 9 2022-03-03    74
10 2022-04-01    70
# ... with 26 more rows
# i Use `print(n = ...)` to see more rows

df %>%
  mutate(season = case_when(month(date) %in% c(12, 1, 2) ~ "Winter", 
                            month(date) %in% c(3, 4, 5) ~ "Spring", 
                            month(date) %in% c(6, 7, 8) ~ "Summer", 
                            TRUE ~ "Autumn"))

# A tibble: 36 x 3
   date         co2 season
   <date>     <int> <chr> 
 1 2022-01-01    62 Winter
 2 2022-01-02    59 Winter
 3 2022-01-03    55 Winter
 4 2022-02-01    90 Winter
 5 2022-02-02    66 Winter
 6 2022-02-03    74 Winter
 7 2022-03-01   104 Spring
 8 2022-03-02   103 Spring
 9 2022-03-03    74 Spring
10 2022-04-01    70 Spring
# ... with 26 more rows
# i Use `print(n = ...)` to see more rows
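
If the goal is a per-season statistic such as the mean or median concentration, the `season` column can be passed to `group_by()` and `summarise()`. The following is only a minimal sketch: the small `df` below is made-up data with the same `date` and `co2` columns as the sample above, and the output column names (`mean_co2`, `median_co2`, `n`) are chosen here for illustration.

```r
library(dplyr)
library(lubridate)

# Hypothetical data; the real dataframe and its column names may differ
df <- tibble(
  date = as.Date(c("2022-01-01", "2022-01-02", "2022-04-01", "2022-04-02",
                   "2022-07-01", "2022-07-02", "2022-10-01", "2022-10-02")),
  co2 = c(60, 70, 80, 100, 90, 110, 50, 70)
)

season_summary <- df %>%
  # Month-based (meteorological) seasons, as in the answer above
  mutate(season = case_when(month(date) %in% c(12, 1, 2) ~ "Winter",
                            month(date) %in% c(3, 4, 5) ~ "Spring",
                            month(date) %in% c(6, 7, 8) ~ "Summer",
                            TRUE ~ "Autumn")) %>%
  group_by(season) %>%
  # One row per season with the mean, median and number of observations
  summarise(mean_co2 = mean(co2, na.rm = TRUE),
            median_co2 = median(co2, na.rm = TRUE),
            n = n())

print(season_summary)
```

`na.rm = TRUE` is included on the assumption that real measurements may contain missing values; drop it if missing values should propagate as `NA`.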

Alternatively, specify the date ranges yourself with `between()`:

df %>%  
  mutate(season = case_when(between(date, as.Date("2022-01-01"), 
                                          as.Date("2022-02-01")) ~ "Winter", 
                            TRUE ~ "Not winter"))
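
For day-precise seasons that work in any year, one possible approach is to combine month and day into a single number (`month * 100 + day`, so March 20 becomes 320) and compare it against cut-offs. This is a sketch under assumptions: only the winter range (December 21 to March 20) comes from the comments; the other cut-offs, the helper name `add_season()` and the sample data are hypothetical and should be adjusted to whatever definition you need.

```r
library(dplyr)
library(lubridate)

# Hypothetical data covering dates on both sides of each assumed boundary
df <- tibble(
  date = as.Date(c("2022-01-15", "2022-03-15", "2022-03-20", "2022-03-21",
                   "2022-06-21", "2022-09-22", "2022-09-23", "2022-12-20",
                   "2022-12-21")),
  co2 = c(62L, 104L, 74L, 70L, 81L, 77L, 90L, 66L, 59L)
)

# Assumed season boundaries (not from the original answer):
#   Winter: Dec 21 - Mar 20, Spring: Mar 21 - Jun 20,
#   Summer: Jun 21 - Sep 22, Autumn: Sep 23 - Dec 20
add_season <- function(data) {
  data %>%
    mutate(
      md = month(date) * 100 + day(date),  # e.g. March 20 -> 320
      season = case_when(
        md >= 1221 | md <= 320 ~ "Winter",
        md <= 620 ~ "Spring",
        md <= 922 ~ "Summer",
        TRUE ~ "Autumn"
      )
    ) %>%
    select(-md)  # drop the helper column
}

result <- add_season(df)
print(result)
```

Because `case_when()` evaluates its conditions in order, each later branch only sees the dates not already matched, so the upper cut-off alone is enough for spring and summer.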
Chamkrai
  • Thank you so much, it already helped me a lot. But instead of using case_when(month(date)), is it possible to use an equivalent function, something like case_when(month&day(date)), to be more precise? With only the month specified, the 15th of March, for instance, will be in spring instead of winter. Thank you – Raphael42 Oct 01 '22 at 10:32
  • @Raphael42 Why would you even do that? – Chamkrai Oct 01 '22 at 11:07
  • I want to have the mean, median... concentration for each season (e.g. winter from December 21 to March 20) – Raphael42 Oct 01 '22 at 11:14
  • Look at the edit, I showed you how with `between()`. You should be able to do the rest yourself and specify it – Chamkrai Oct 01 '22 at 11:27
  • I like your answer and would likely do the same with `case_when`. Another option could be `season = c(rep("Winter", 2), rep(c("Spring", "Summer", "Fall"), each = 3), "Winter")[month(date)]`. Probably not worth another answer, but thought I'd drop it here. – AndS. Oct 01 '22 at 12:49
  • @AndS. Works for my sample data, but not OP's data – Chamkrai Oct 01 '22 at 12:54
  • @TomHoel I agree, but I feel like most people define seasons as you describe them in your answer. I have not heard of Winter being Dec. 21 to Mar. 20. – AndS. Oct 01 '22 at 12:59