I have a column (called eventCat) in my data frame of 5 factors (Drought, Dry, Normal, Wet, Storm) e.g.

eventCat
Dry
Dry
Drought
Drought
Wet
Storm
Storm 
Normal 
Normal
Dry
Dry

I want to provide an ID to each group of events, so that the df looks like this (Note different IDs for the different Dry events):

eventCat          eventCatID
Dry               1
Dry               1
Drought           2
Drought           2
Wet               3
Storm             4
Storm             4
Normal            5
Normal            5
Dry               6
Dry               6
Melanie Baker
  • `ID <- data.table::rleid(eventCat)`. Note, if I understand you correctly, you have one factor with five levels, not five factors. – Limey Jul 26 '22 at 10:34
  • Should `Dry` both have eventCatID `1` AND `6`? If not: `mutate(eventCat = as_factor(eventCat) |> as.numeric())` is the way to go. – harre Jul 26 '22 at 10:36
  • @harre No I want the separate events for the different Dry periods. The answer I have ticked as correct has done this for me. – Melanie Baker Jul 26 '22 at 10:52

1 Answer

For this example you could increase the eventCatID by one every time eventCat is different from the previous eventCat (no change if it's the same), e.g.

library(dplyr)

df <- structure(list(eventCat = c("Dry", "Dry", "Drought", "Drought", 
                                  "Wet", "Storm", "Storm", "Normal",
                                  "Normal", "Dry", "Dry")),
                class = "data.frame", row.names = c(NA, -11L))

df %>%
  mutate(eventCatID = 1 + cumsum(eventCat != lag(eventCat, default = first(eventCat))))
#>    eventCat eventCatID
#> 1       Dry          1
#> 2       Dry          1
#> 3   Drought          2
#> 4   Drought          2
#> 5       Wet          3
#> 6     Storm          4
#> 7     Storm          4
#> 8    Normal          5
#> 9    Normal          5
#> 10      Dry          6
#> 11      Dry          6

Created on 2022-07-26 by the reprex package (v2.0.1)

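But this relies on the rows being in the 'right' order, i.e. all rows of one event sitting next to each other: the ID only changes when eventCat differs from the row directly above it. Does this work with your real data?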
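If you would rather not depend on dplyr, the same "new ID whenever the value changes" idea can be sketched in base R with `rle()`. This is only a minimal sketch on the example data above; the `runs` variable is a name I made up for illustration, and the `as.character()` call is there on the assumption that your real column is a factor (`rle()` does not accept factors directly). `data.table::rleid(eventCat)`, as suggested in the comments, should give the same IDs in one call.

```r
# Same example data as above
df <- data.frame(eventCat = c("Dry", "Dry", "Drought", "Drought", "Wet",
                              "Storm", "Storm", "Normal", "Normal",
                              "Dry", "Dry"))

# rle() finds runs of consecutive identical values.
# as.character() is an assumption: it makes this work for factor columns too.
runs <- rle(as.character(df$eventCat))

# Number the runs 1, 2, 3, ... and repeat each number by its run length,
# so every row gets the ID of the run it belongs to
df$eventCatID <- rep(seq_along(runs$lengths), times = runs$lengths)

df$eventCatID
#>  [1] 1 1 2 2 3 4 4 5 5 6 6
```

Like the `cumsum()` version, this only looks at neighbouring rows, so it carries the same caveat about row order.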

jared_mamrot