How can I count the frequency of an answer in R?

Question

I have a db that looks like this:

ID Group Drink
1   A      yes
2   A      no
3   A      NA
4   B      no
5   B      no
6   B      yes

and I would like to count how many people in group A drink and how many people in group B drink.

I am using length(), but this function returns 3 (the NA is being counted as a yes). How can I fix it?

You are looking for a cross tabulation which can be achieved using the `table` function. — Brian Diggs, May 16 '14 at 15:08

score 4 · Accepted Answer · answered May 16 '14 at 15:08

table() is one option:

db <- read.table(text = "ID Group Drink
1   A      yes
2   A      no
3   A      NA
4   B      no
5   B      no
6   B      yes", header = TRUE)

with(db, table(Drink))
with(db, table(Group, Drink))

> with(db, table(Drink))
Drink
 no yes 
  3   2 
> with(db, table(Group, Drink))
     Drink
Group no yes
    A  1   1
    B  2   1

Including the NA as a class is achieved by the useNA argument:

with(db, table(Drink, useNA = "ifany"))

> with(db, table(Drink, useNA = "ifany"))
Drink
  no  yes <NA> 
   3    2    1
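
For reference, useNA accepts "no", "ifany" and "always". Here is a minimal sketch of the difference, reusing the db data frame from above (the object names drink_ifany, group_ifany and group_always are only illustrative): "ifany" adds an <NA> cell only when missing values are present, while "always" reports it even when its count is zero.

```r
db <- read.table(text = "ID Group Drink
1   A      yes
2   A      no
3   A      NA
4   B      no
5   B      no
6   B      yes", header = TRUE)

# Drink contains one NA, so "ifany" adds an <NA> cell
drink_ifany <- table(db$Drink, useNA = "ifany")

# Group has no missing values: "ifany" leaves the table unchanged,
# while "always" appends an <NA> cell with a count of zero
group_ifany  <- table(db$Group, useNA = "ifany")
group_always <- table(db$Group, useNA = "always")

print(drink_ifany)
print(group_ifany)
print(group_always)
```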

You can of course store the objects returned by table() and access them as any other matrix/array:

tab <- with(db, table(Drink, useNA = "ifany"))
tab[1]
tab2 <- with(db, table(Group, Drink, useNA = "ifany"))
tab2[,1]
tab2[1,]

> tab <- with(db, table(Drink, useNA = "ifany"))
> tab[1]
no 
 3 
> tab2 <- with(db, table(Group, Drink, useNA = "ifany"))
> tab2[,1]
A B 
1 2 
> tab2[1,]
  no  yes <NA> 
   1    1    1
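
Indexing by position works, but indexing by dimension name is usually easier to read and does not depend on the order of the levels. A small sketch, assuming the same db data frame as above (a_yes and yes_by_group are illustrative names):

```r
db <- read.table(text = "ID Group Drink
1   A      yes
2   A      no
3   A      NA
4   B      no
5   B      no
6   B      yes", header = TRUE)

tab2 <- with(db, table(Group, Drink, useNA = "ifany"))

# Number of people in group A who answered "yes"
a_yes <- tab2["A", "yes"]

# All "yes" counts, one per group
yes_by_group <- tab2[, "yes"]

print(a_yes)
print(yes_by_group)
```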

score 2 · Answer 2 · answered May 16 '14 at 15:12

Here's another way using aggregate(...)

aggregate(Drink~Group,df,function(x)sum(x=="yes"))
#   Group Drink
# 1     A     1
# 2     B     1

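To get the proportion that drink (the formula interface of aggregate() drops rows with NA by default, so the NA is not counted in the denominator):

aggregate(Drink~Group,df,function(x)sum(x=="yes")/sum(!is.na(x)))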
#   Group     Drink
# 1     A 0.5000000
# 2     B 0.3333333
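
If you want to be explicit about how the NA is treated instead of relying on aggregate() dropping it, one possible sketch uses tapply(). It assumes the data frame is the db object built in the accepted answer (called df in this answer); prop_answered and prop_everyone are illustrative names.

```r
db <- read.table(text = "ID Group Drink
1   A      yes
2   A      no
3   A      NA
4   B      no
5   B      no
6   B      yes", header = TRUE)

# Proportion of "yes" among people who actually answered:
# the comparison yields NA for the missing answer, and na.rm = TRUE skips it
prop_answered <- tapply(db$Drink == "yes", db$Group, mean, na.rm = TRUE)

# Proportion of "yes" among everyone in the group:
# %in% returns FALSE for NA, so the missing answer stays in the denominator
prop_everyone <- tapply(db$Drink %in% "yes", db$Group, mean)

print(prop_answered)
print(prop_everyone)
```

The first version matches the aggregate() result above; the second divides by the full group size of 3.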

score 1 · Answer 3 · answered May 16 '14 at 15:21

xtabs is another option:

xtabs(~ Group + Drink, df)

#     Drink
#Group no yes
#    A  1   1
#    B  2   1

And in case you need a data.frame as output:

as.data.frame(xtabs(~ Group + Drink, df))

#  Group Drink Freq
#1     A    no    1
#2     B    no    2
#3     A   yes    1
#4     B   yes    1
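
By default xtabs() leaves the NA out, as in the tables above. If you want it counted as its own category, recent versions of R (3.4.0 and later, as far as I know) provide an addNA argument. A minimal sketch, assuming the data is the db object from the accepted answer (tab_na is an illustrative name):

```r
db <- read.table(text = "ID Group Drink
1   A      yes
2   A      no
3   A      NA
4   B      no
5   B      no
6   B      yes", header = TRUE)

# addNA = TRUE keeps rows with missing values and adds an <NA> level
# to any variable that has them
tab_na <- xtabs(~ Group + Drink, db, addNA = TRUE)

print(tab_na)
```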

Rich Scriven · Answer 4 · 2014-05-17T03:58:53.630

Assuming d is your data, and assuming NA is considered yes (since you stated that in your post), the proportions of drinkers are

> d$Drink[is.na(d$Drink)] <- 'yes'
> tab <- table(d$Group, d$Drink)
> tab[,'yes']/rowSums(tab)
##         A         B 
## 0.6666667 0.3333333

You could also use the count function from the plyr package:

> library(plyr)
> x <- count(d)
> cbind(x[x$Drink == 'yes', ], inGroup = count(d$Group)$freq)
#   Group Drink freq inGroup
# 2     A   yes    2       3
# 4     B   yes    1       3

score -1 · Answer 5 · answered May 16 '14 at 18:43

You could also use prop.table(), which converts the counts in a table returned by table() into proportions.
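
A short sketch of what that could look like, assuming the db data frame from the accepted answer (tab and row_props are illustrative names). With margin = 1 each row is divided by its row total, so the proportions are computed within each group:

```r
db <- read.table(text = "ID Group Drink
1   A      yes
2   A      no
3   A      NA
4   B      no
5   B      no
6   B      yes", header = TRUE)

# Counts per group; the NA is excluded by default
tab <- with(db, table(Group, Drink))

# Row-wise proportions: each group's row sums to 1
row_props <- prop.table(tab, margin = 1)

print(row_props)
```

Without the margin argument, prop.table() divides by the grand total instead.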

lawyeR
