
I have a large dataset with 23,500 rows. Each row contains duplicated events, and I need to count only the unique ones. That is, for each row I need to count the distinct values across 30 columns and store that count in a new column. What is the easiest way to do this?

  • It would be easier to help if you create a small reproducible example along with expected output. Read about [how to give a reproducible example](http://stackoverflow.com/questions/5963269). – Ronak Shah Jun 28 '21 at 03:47

2 Answers


In base R, use apply with MARGIN = 1 to loop over the rows, keep the unique non-missing elements of each row, and take their length:

df1$new <- apply(df1, 1, FUN = function(x) length(unique(x[complete.cases(x)])))
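As a minimal sketch of how this behaves, assume a small made-up data frame (the column names e1 to e4 and the values are hypothetical, not taken from the question). It restricts apply to the event columns and uses !is.na(x), which does the same job as complete.cases(x) on a single row vector:

```r
# Hypothetical toy data: 3 rows, 4 event columns, with repeats and NAs
df1 <- data.frame(
  e1 = c("a", "b", "c"),
  e2 = c("a", "c", "c"),
  e3 = c("b", NA,  "c"),
  e4 = c("a", "d", NA),
  stringsAsFactors = FALSE
)

# Name the event columns explicitly so that other columns
# (including 'new' itself on a re-run) are never counted
event_cols <- c("e1", "e2", "e3", "e4")

# For each row: drop NAs, keep the distinct values, count them
df1$new <- apply(df1[event_cols], 1, function(x) length(unique(x[!is.na(x)])))

df1$new
# [1] 2 3 1
```

One thing to keep in mind: apply first converts the data frame to a matrix, so if the selected columns mix numeric and character data, everything is coerced to character before comparison. Selecting only the event columns avoids surprises from unrelated columns.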

Another option is rowwise in dplyr:

library(dplyr)
df1 %>%
   rowwise %>%
   mutate(new = n_distinct(c_across(everything()), na.rm = TRUE)) %>%
   ungroup
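If the data frame also holds non-event columns (an id, for example), everything() would count those too. A small sketch under that assumption, with hypothetical column names id and e1 to e3, selects only the event columns inside c_across:

```r
library(dplyr)

# Hypothetical toy data: an id column plus three event columns
df1 <- tibble(
  id = 1:3,
  e1 = c("a", "b", "c"),
  e2 = c("a", "c", "c"),
  e3 = c("b", NA,  "c")
)

result <- df1 %>%
  rowwise() %>%
  # Count distinct non-missing values among the event columns only
  mutate(new = n_distinct(c_across(e1:e3), na.rm = TRUE)) %>%
  ungroup()

result$new
# [1] 2 2 1
```

Note that the pipeline returns a new data frame rather than modifying df1, so the result has to be assigned to keep the new column.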
akrun

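Alternatively, iterate over the rows with pmap_dbl from purrr (note that cur_data() is deprecated as of dplyr 1.1.0 in favor of pick()):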

library(dplyr)
library(purrr)

df %>%
  mutate(new = pmap_dbl(select(cur_data(), everything()), ~ n_distinct(c(...), na.rm = TRUE)))
Anoushiravan R