How do I count the number of unique values across multiple columns in R?

Question

I have a large dataset with 23,500 rows. Each row contains duplicated events, and I need to count the unique events. That is, for each row I need to count the unique values across 30 columns and store that count in a new column. What is the easiest way to do this?

It would be easier to help if you create a small reproducible example along with expected output. Read about [how to give a reproducible example](http://stackoverflow.com/questions/5963269). — Ronak Shah, Jun 28 '21 at 03:47

score 1 · Answer 1 · answered Jun 27 '21 at 23:47

Use apply with MARGIN = 1 to loop over the rows, take the unique non-missing elements of each row, and find their length, all in base R:

df1$new <- apply(df1, 1, FUN = function(x) length(unique(x[complete.cases(x)])))
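As a minimal sketch of the base R approach on made-up data: the data frame contents and the column names e1, e2, e3 are assumptions for illustration (the real data has 30 event columns). Selecting the event columns by name, rather than passing the whole data frame, keeps any other columns (including a previously created new column) out of the count.

```r
# Hypothetical toy data: 4 rows, 3 event columns standing in for the real 30.
df1 <- data.frame(
  e1 = c("a", "b", "c", NA),
  e2 = c("a", "c", "c", NA),
  e3 = c("b", NA,  "c", NA),
  stringsAsFactors = FALSE
)

# Names of the columns to count over (assumed; adjust to the real data).
event_cols <- c("e1", "e2", "e3")

# For each row, drop missing values and count the distinct ones that remain.
df1$new <- apply(df1[event_cols], 1, function(x) length(unique(x[!is.na(x)])))

df1$new
# 2 2 1 0
```

A row that contains only NA gets a count of 0, because the missing values are removed before unique is applied.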

Another option is rowwise in dplyr:

library(dplyr)
df1 %>%
   rowwise %>%
   mutate(new = n_distinct(c_across(everything()), na.rm = TRUE)) %>%
   ungroup
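The dplyr version can be tried on a small made-up tibble, sketched below. The data and the column names e1 to e3 are assumptions; c_across(e1:e3) is used instead of everything() so that only the event columns are counted if the data frame also holds other columns.

```r
library(dplyr)

# Hypothetical toy data: 3 event columns standing in for the real 30.
df1 <- tibble(
  e1 = c("a", "b", "c", NA),
  e2 = c("a", "c", "c", NA),
  e3 = c("b", NA,  "c", NA)
)

res <- df1 %>%
  rowwise() %>%                                   # treat each row as its own group
  mutate(new = n_distinct(c_across(e1:e3), na.rm = TRUE)) %>%
  ungroup()                                       # return to a regular tibble

res$new
# 2 2 1 0
```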

score 1 · Answer 2 · answered Jun 27 '21 at 23:54

Or maybe this one, using pmap_dbl from purrr:

library(dplyr)
library(purrr)

df %>%
  mutate(new = pmap_dbl(select(cur_data(), everything()), ~ n_distinct(c(...), na.rm = TRUE)))
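Note that cur_data() was deprecated in dplyr 1.1.0 in favor of pick(). A sketch of the same idea with pick(), assuming dplyr 1.1.0 or later and a made-up data frame with hypothetical columns e1 to e3, might look like this:

```r
library(dplyr)
library(purrr)

# Hypothetical toy data: 3 event columns standing in for the real 30.
df <- tibble(
  e1 = c("a", "b", "c", NA),
  e2 = c("a", "c", "c", NA),
  e3 = c("b", NA,  "c", NA)
)

res <- df %>%
  # pick() returns the selected columns as a data frame; pmap_dbl() then
  # passes the values of one row at a time to the function as `...`.
  mutate(new = pmap_dbl(pick(e1:e3), ~ n_distinct(c(...), na.rm = TRUE)))

res$new
# 2 2 1 0
```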

Anoushiravan R
