Background

Here's a dataframe d:

d <- data.frame(ID = c("a", "a", "b", "b"),
                product_code = c("B78", "X31", "C12", "C12"),
                multiple_products = c(1, 1, 0, 0),
                stringsAsFactors = FALSE)

The Problem & What I Want

I'm trying to make a cross-tabulation-style frequency table of multiple_products using base R's table function, but I want to do so by ID and not by row. Here's what I'm looking for:

0 1 
1 1 

In other words, a table that says "there's 1 ID where multiple_products equals 0, and 1 ID where it equals 1".

What I've Tried

Here's my attempt so far using dplyr:

dtable <- d %>%
  group_by(ID) %>%
  table(d$multiple_products) %>%
  ungroup()

This code runs on my real dataset without errors, but it gives me the same result that table(d$multiple_products) would, namely this:

0 1 
2 2 

Which indicates "2 rows where multiple_products equals 0, and 2 rows where it equals 1".

In the toy example I'm giving you here, this code doesn't even run, giving me the following error:

Error: Can't combine `ID` <character> and `multiple_products` <double>.

Any thoughts?

logjammin

1 Answer

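table() counts rows, so grouping by ID first doesn't change its result. Instead, group by multiple_products and count the distinct IDs in each group with n_distinct: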

library(dplyr)
d %>% 
    group_by(multiple_products) %>% 
    summarise(n = n_distinct(ID))

-output

# A tibble: 2 x 2
  multiple_products     n
              <dbl> <int>
1                 0     1
2                 1     1
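
If the result should be a base R table, as the question asks, one possible sketch is to reduce the data to one row per distinct (ID, multiple_products) pair and then call table() on what is left. The names per_id and id_counts are made up for illustration. As with n_distinct, an ID that had both values of multiple_products would be counted once under each value.

```r
# Toy data from the question
d <- data.frame(ID = c("a", "a", "b", "b"),
                product_code = c("B78", "X31", "C12", "C12"),
                multiple_products = c(1, 1, 0, 0),
                stringsAsFactors = FALSE)

# Keep one row per distinct (ID, multiple_products) pair
# (per_id is a hypothetical helper name)
per_id <- unique(d[, c("ID", "multiple_products")])

# table() now counts IDs rather than rows
id_counts <- table(per_id$multiple_products)
id_counts
```

On the toy data this gives a count of 1 under 0 and 1 under 1, which is the table the question asks for.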
akrun