Count occurrences of factor in R, with zero counts reported

Question

I want to count the number of occurrences of a factor in a data frame. For example, to count the number of events of a given type in the code below:

library(plyr)
events <- data.frame(type = c('A', 'A', 'B'),
                       quantity = c(1, 2, 1))
ddply(events, .(type), summarise, quantity = sum(quantity))

The output is the following:

     type quantity
1    A        3
2    B        1

However, what if I know that there are three types of events A, B and C, and I also want to see the count for C which is 0? In other words, I want the output to be:

     type quantity
1    A        3
2    B        1
3    C        0

How do I do this? It feels like there should be a function defined to do this somewhere.

The following are my two not-so-good ideas about how to go about this.

Idea #1: I could use a for loop, but it is widely said that if you are using a for loop in R you are doing something wrong, so there must be a better way.

Idea #2: Add dummy entries to the original data frame. This solution works but it feels like there should be a more elegant solution.

events <- data.frame(type = c('A', 'A', 'B'),
                       quantity = c(1, 2, 1))
events <- rbind(events, data.frame(type = 'C', quantity = 0))
ddply(events, .(type), summarise, quantity = sum(quantity))

`e <- sapply(events, FUN=as.factor); table(e)` – isomorphismes Feb 10 '14 at 06:35

score 23 · Accepted Answer · answered Apr 18 '13 at 03:29

You get this for free if you define your events variable correctly as a factor with the desired three levels:

R> events <- data.frame(type = factor(c('A', 'A', 'B'), c('A','B','C')), 
+                       quantity = c(1, 2, 1))
R> events
  type quantity
1    A        1
2    A        2
3    B        1
R> table(events$type)

A B C 
2 1 0 
R>

Simply calling table() on the factor already does the right thing, and ddply() can too if you tell it not to drop:

R> ddply(events, .(type), summarise, quantity = sum(quantity), .drop=FALSE)
  type quantity
1    A        3
2    B        1
3    C        0
R>
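If the data frame already exists with `type` as a plain character column, the same idea can be applied after the fact. The following is a minimal base R sketch, assuming the full set of types is known in advance; the names `all_types`, `counts`, `totals` and `result` are illustrative and not part of the original answer.

```r
# Data as in the question, with 'type' as a plain character column.
events <- data.frame(type = c("A", "A", "B"),
                     quantity = c(1, 2, 1),
                     stringsAsFactors = FALSE)

# Assumption: the complete set of types is known, even though "C" never occurs.
all_types <- c("A", "B", "C")
events$type <- factor(events$type, levels = all_types)

# Number of rows per level; empty levels are reported as 0.
counts <- table(events$type)

# Sum of quantity per level. split() keeps empty levels as zero-length
# vectors, and sum() of a zero-length vector is 0.
totals <- sapply(split(events$quantity, events$type), sum)

# Back to a data frame with one row per level.
result <- data.frame(type = names(totals), quantity = unname(totals))
print(result)
```

This avoids any package dependency, at the cost of rebuilding the data frame by hand.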

score 4 · Answer 2 · answered Apr 18 '13 at 03:41

> xtabs(quantity~type, events)
type
A B C 
3 1 0

IRTFM

Doh, even better. Nice. Somehow I always forget about `xtabs`. But also needs the corrected factor variable I show. – Dirk Eddelbuettel Apr 18 '13 at 03:44
I only used the OP's data. There is an implicit `sum` in `xtabs`. – IRTFM Apr 18 '13 at 04:02
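To illustrate the two comments above, here is a small sketch that combines `xtabs` with the factor definition from the accepted answer. It shows the implicit sum when a left-hand side is given, and plain row counts when it is omitted. The variable names `sums`, `counts` and `sums_df` are illustrative only.

```r
# Factor with all three levels declared, as in the accepted answer.
events <- data.frame(type = factor(c("A", "A", "B"), levels = c("A", "B", "C")),
                     quantity = c(1, 2, 1))

# With a left-hand side, xtabs() sums that variable within each level.
sums <- xtabs(quantity ~ type, data = events)

# With no left-hand side, xtabs() counts rows per level.
counts <- xtabs(~ type, data = events)

# Convert the table back to a data frame; responseName sets the value column.
sums_df <- as.data.frame(sums, responseName = "quantity")
print(sums_df)
```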

score 2 · Answer 3 · answered Oct 11 '17 at 08:08

Using the dplyr library:

library(dplyr)
data <- data.frame(level = c('A', 'A', 'B', 'B', 'B', 'C'),
                   value = c(1:6))

data %>%
  group_by(level) %>%
  summarize(count = n()) %>%
  View

If you also want other summaries, such as the maximum and minimum, try this:

data %>%
  group_by(level) %>%
  summarise(count = n(), Max_val = max(value), Min_val = min(value)) %>%
  View
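The pipelines above only report levels that actually occur in the data. To get zero counts as the question asks, one option is to make the grouping column a factor with all levels declared and pass `.drop = FALSE` to `group_by()`. This is a sketch that assumes dplyr 0.8.0 or later, and the extra level `"D"` is hypothetical, added only to show an empty group.

```r
library(dplyr)

# Same data as above, but 'level' is a factor with an unused level "D".
data <- data.frame(level = factor(c("A", "A", "B", "B", "B", "C"),
                                  levels = c("A", "B", "C", "D")),
                   value = 1:6)

# .drop = FALSE keeps groups for factor levels that have no rows.
counts <- data %>%
  group_by(level, .drop = FALSE) %>%
  summarise(count = n())

print(counts)
```

Summaries such as `max()` and `min()` are left out here because they warn and return infinite values on an empty group.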

score 0 · Answer 4 · answered Apr 18 '13 at 03:45

Quite similar to @DWin's answer, using the question's data frame with the dummy C entry:

> aggregate(quantity~type, events, FUN=sum)
  type quantity
1    A        3
2    B        1
3    C        0

Matthew Lundberg

@DirkEddelbuettel Or his definition, with the dummy entries (what I actually used). – Matthew Lundberg Apr 18 '13 at 03:47
Which amounts to the same in a more convoluted way -- the char variable gets turned into a factor later by aggregate. – Dirk Eddelbuettel Apr 18 '13 at 03:49
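Note that `aggregate()` drops groups that have no rows, so with the factor definition from the accepted answer (and no dummy row) level C would not appear in its result. One possible workaround, sketched here in base R, is to merge the aggregated result against the full list of levels and fill the gaps with 0. The names `agg`, `all_types` and `full` are illustrative only.

```r
# Factor with all three levels declared, but no dummy row for "C".
events <- data.frame(type = factor(c("A", "A", "B"), levels = c("A", "B", "C")),
                     quantity = c(1, 2, 1))

# aggregate() only returns the groups that actually occur (A and B).
agg <- aggregate(quantity ~ type, data = events, FUN = sum)

# Left-join the full set of levels onto the aggregated result.
all_types <- data.frame(type = levels(events$type))
full <- merge(all_types, agg, by = "type", all.x = TRUE)

# Levels with no rows come back as NA; report them as 0 instead.
full$quantity[is.na(full$quantity)] <- 0
print(full)
```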

score 0 · Answer 5 · answered May 30 '19 at 21:43

Pass the vector you want to count (for example, a data frame column) as data, and your categories as levels.

table(factor(data, levels = 1:5))
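As a concrete instance of the one-liner above, here is a short sketch applied to a single column of a data frame. The data frame `df` and its `score` column are made up for illustration.

```r
# Hypothetical data: scores on a 1 to 5 scale, where 3 and 4 never occur.
df <- data.frame(score = c(1, 2, 2, 5))

# Declare all five categories so that the missing ones are counted as 0.
counts <- table(factor(df$score, levels = 1:5))
print(counts)
```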

Zanoldor

score 0 · Answer 6 · answered Mar 21 '21 at 05:03

Turn the type column into a factor and use count().

library(dplyr)
library(tidyr)

events %>% count(type = factor(type, c('A', 'B', 'C')), .drop = FALSE)

#  type n
#1    A 2
#2    B 1
#3    C 0

Another option is complete() from tidyr.

events %>%
  count(type) %>%
  complete(type = c('A', 'B', 'C'), fill = list(n = 0))
