I have a large dataset with 23,500 rows. Every row contains duplicated events, and I need to count the unique ones. That is, for each row I need to count the unique events across 30 columns and store that count in a new column. What is the easiest way to do this?
It would be easier to help if you create a small reproducible example along with expected output. Read about [how to give a reproducible example](http://stackoverflow.com/questions/5963269). – Ronak Shah Jun 28 '21 at 03:47
2 Answers
Use apply with MARGIN = 1 to loop over the rows, get the unique elements of each row, and find their length in base R:
df1$new <- apply(df1, 1, FUN = function(x) length(unique(x[complete.cases(x)])))
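As a minimal sketch of how this behaves, here is the same idea on a made-up data frame. The column names e1 to e4 and the event codes are assumptions for illustration, not taken from the question. It uses !is.na(x), which for a plain vector selects the same elements as complete.cases(x):

```r
# Toy data: 3 rows, 4 event columns (hypothetical names and values)
df1 <- data.frame(
  e1 = c("A", "B", "C"),
  e2 = c("A", "C", "C"),
  e3 = c("B", NA,  "C"),
  e4 = c("C", "B", NA),
  stringsAsFactors = FALSE
)

# Count the distinct non-missing values in one row
count_unique <- function(x) length(unique(x[!is.na(x)]))

# MARGIN = 1 applies the function to each row
df1$new <- apply(df1, 1, count_unique)

df1$new
# expected counts: 3, 2, 1
```

Note that apply converts the data frame to a matrix first, so this works best when all event columns have the same type.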
Another option is rowwise in dplyr:
library(dplyr)
df1 %>%
  rowwise() %>%
  mutate(new = n_distinct(c_across(everything()), na.rm = TRUE)) %>%
  ungroup()
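If the data also holds columns that should not be counted (an ID, for example), c_across accepts a tidyselect helper instead of everything(). The sketch below assumes the event columns share the prefix "e"; the id column, the prefix, and the values are all hypothetical:

```r
library(dplyr)

# Toy data with an id column that should not be counted (hypothetical names)
df1 <- tibble(
  id = 1:3,
  e1 = c("A", "B", "C"),
  e2 = c("A", "C", "C"),
  e3 = c("B", NA,  "C")
)

res <- df1 %>%
  rowwise() %>%
  # Only columns whose names start with "e" enter the count
  mutate(new = n_distinct(c_across(starts_with("e")), na.rm = TRUE)) %>%
  ungroup()

res$new
# expected counts: 2, 2, 1
```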

akrun
Or maybe this one, which uses pmap_dbl from purrr to go through the rows:
library(dplyr)
library(purrr)
df %>%
mutate(new = pmap_dbl(select(cur_data(), everything()), ~ n_distinct(c(...), na.rm = TRUE)))
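On dplyr 1.1.0 or later, cur_data() is deprecated in favour of pick(), so the same idea might be written as below. This is only a sketch on made-up data (the column names and values are assumptions), and it needs a dplyr version that provides pick():

```r
library(dplyr)
library(purrr)

# Toy data (hypothetical names and values)
df <- tibble(
  e1 = c("A", "B"),
  e2 = c("A", NA),
  e3 = c("B", "B")
)

res <- df %>%
  # pick(everything()) returns the current columns as a data frame;
  # pmap_dbl then calls the function once per row
  mutate(new = pmap_dbl(pick(everything()), ~ n_distinct(c(...), na.rm = TRUE)))

res$new
# expected counts: 2, 1
```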

Anoushiravan R