I have some data similar to this example data, where columns stand for years:
data <- data.frame(
  X2020_1 = c("A", NA, "B"),
  X2020_2 = c("A", "C", NA),
  X2021_1 = c("A", NA, "C"),
  X2021_2 = c(NA, NA, "A")
)
which looks like this:
  X2020_1 X2020_2 X2021_1 X2021_2
1       A       A       A    <NA>
2    <NA>       C    <NA>    <NA>
3       B    <NA>       C       A
I want to group the columns based on their names: anything with X2020 in its name should go into a new X2020 column, and anything with X2021 into a new X2021 column. Within each row, I also want to drop NA values, unless that row has only NAs in the column group, in which case it should stay NA.
My desired result should look like this:
    X2020   X2021
1 c(A, A)       A
2       C    <NA>
3       B c(C, A)
I'm not really sure how to get there.
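For reference, here is a rough base R sketch of one direction this could take. It assumes the group key is the column name with its trailing "_<number>" suffix removed, and that the new columns may be list-columns (one character vector per row). The names prefixes, collapse_group and result are made up for illustration, and this may not be the most idiomatic way to do it.

```r
data <- data.frame(
  X2020_1 = c("A", NA, "B"),
  X2020_2 = c("A", "C", NA),
  X2021_1 = c("A", NA, "C"),
  X2021_2 = c(NA, NA, "A")
)

# Assumption: the group key is the column name without its "_<number>" suffix
prefixes <- sub("_[0-9]+$", "", names(data))

# Hypothetical helper: for one group of columns, return a list with one
# element per row, holding that row's non-NA values (or a single NA if
# every value in the row is missing)
collapse_group <- function(cols) {
  m <- as.matrix(cols)
  lapply(seq_len(nrow(m)), function(i) {
    vals <- unname(m[i, ])
    vals <- vals[!is.na(vals)]
    if (length(vals) == 0) NA_character_ else vals
  })
}

# Split the columns by prefix, then add one list-column per group
groups <- split.default(data, prefixes)
result <- data.frame(row.names = seq_len(nrow(data)))
for (g in names(groups)) {
  result[[g]] <- collapse_group(groups[[g]])
}

print(result)
```

Each cell of result is then a character vector, so result$X2020[[1]] holds both "A" values for row 1. Duplicates are kept on purpose, since the desired output shows c(A, A).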