Create better frequency tables in R

Question

Here is some data:

dta <- data.frame(
  id = 1:10, 
  code1 = as.factor(sample(c("male", "female"), 10, replace = TRUE)),
  code2 = as.factor(sample(c("yes", "no", "maybe"), 10, replace = TRUE)),
  code3 = as.factor(sample(c("yes", "no"), 10, replace = TRUE))
)

I would like a nicely formatted frequency table for the code variables.

codes <- c("code1", "code2", "code3")

For example, we can run the built-in function table.

> sapply(dta[, codes], table)
$code1

female   male 
     4      6 

$code2

maybe    no   yes 
    5     2     3 

$code3

 no yes 
  4   6

All the information is here, but it would be nice to have a table like this:

library(plyr)
ddply(dta, .(code1), summarize, n1 = length(code1))
   code1 n1
1 female  4
2   male  6

I would like this for each of the three variables, either as separate data frames or all in one.

How can we loop over the variables? Other approaches are welcome too.

http://cran.r-project.org/web/packages/sjPlot/index.html – aatrujillob Mar 13 '14 at 19:50

score 1 · Answer 1 · answered Mar 13 '14 at 17:38

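You could use lapply with table and then wrap each result in a data frame:

codes <- c("code1", "code2", "code3")
# tabulate each column, then convert each table to a data frame;
# naming the argument "value" prefixes the resulting column names
tbl <- lapply(lapply(dta[, codes], table), function(tab) data.frame(value = tab))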

Which will give you:

tbl
$code1
  value.Var1 value.Freq
1     female          6
2       male          4

$code2
  value.Var1 value.Freq
1      maybe          4
2         no          5
3        yes          1

$code3
  value.Var1 value.Freq
1         no          4
2        yes          6

So you can access each data frame with tbl$code1, tbl$code2 and so on. For example:

tbl$code1
  value.Var1 value.Freq
1     female          6
2       male          4
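
If the column names should match the layout asked for in the question (code1 and n1 rather than value.Var1 and value.Freq), a base-R sketch along these lines may help. It assumes dta and codes as defined in the question. freq_table is a hypothetical helper name, not part of any package, and set.seed is added only so the sampled data is reproducible.

```r
set.seed(1)  # assumption: fixed seed so the sampled data is reproducible
dta <- data.frame(
  id = 1:10,
  code1 = as.factor(sample(c("male", "female"), 10, replace = TRUE)),
  code2 = as.factor(sample(c("yes", "no", "maybe"), 10, replace = TRUE)),
  code3 = as.factor(sample(c("yes", "no"), 10, replace = TRUE))
)
codes <- c("code1", "code2", "code3")

# freq_table is a hypothetical helper: it returns one frequency data frame
# for a single variable, with the first column named after that variable
# and the count column named "n1"
freq_table <- function(data, var) {
  out <- as.data.frame(table(data[[var]]), stringsAsFactors = FALSE)
  names(out) <- c(var, "n1")
  out
}

# one data frame per code variable, stored in a named list
tbl <- lapply(codes, function(v) freq_table(dta, v))
names(tbl) <- codes

tbl$code1
```

Each element of tbl is then a two-column data frame, so tbl$code2 and tbl$code3 can be accessed the same way.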

score 1 · Answer 2 · answered Mar 13 '14 at 17:47

Does this work? It is generic in that it doesn't require knowing the "codes" first.

library(plyr)
library(reshape)

dta <- data.frame(
  id = 1:10, 
  code1 = as.factor(sample(c("male", "female"), 10, replace = TRUE)),
  code2 = as.factor(sample(c("yes", "no", "maybe"), 10, replace = TRUE)),
  code3 = as.factor(sample(c("yes", "no"), 10, replace = TRUE))
)

d1 <- melt(dta, "id")

d2 <- count(d1, .(variable, value))

d3 <- by(d2, d2$variable, function(x) {
  v <- as.character(x[1,]$variable)
  y <- x[,2:3]
  colnames(y) <- c(v, "n1")
  return(y)
})

d3 

## d2$variable: code1
##    code1 n1
## 1 female  6
## 2   male  4
## ----------------------------------------------------------------------- 
## d2$variable: code2
##   code2 n1
## 3 maybe  2
## 4    no  5
## 5   yes  3
## ----------------------------------------------------------------------- 
## d2$variable: code3
##   code3 n1
## 6    no  4
## 7   yes  6
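
For the "all in one" variant mentioned in the question, one possible sketch stacks the per-variable counts into a single long data frame using base R only, without plyr or reshape. It assumes the same dta layout as above, with id as the only non-code column. The name all_freq and the column names variable, value and n1 are illustrative choices, and set.seed is added only for reproducibility.

```r
set.seed(1)  # assumption: fixed seed so the sampled data is reproducible
dta <- data.frame(
  id = 1:10,
  code1 = as.factor(sample(c("male", "female"), 10, replace = TRUE)),
  code2 = as.factor(sample(c("yes", "no", "maybe"), 10, replace = TRUE)),
  code3 = as.factor(sample(c("yes", "no"), 10, replace = TRUE))
)

# every column except id, so the code names need not be known in advance
codes <- setdiff(names(dta), "id")

# build one small data frame per variable and stack them row-wise
all_freq <- do.call(rbind, lapply(codes, function(v) {
  counts <- table(dta[[v]])
  data.frame(
    variable = v,
    value = names(counts),
    n1 = as.vector(counts),
    stringsAsFactors = FALSE
  )
}))

all_freq
```

A single variable can then be pulled out with all_freq[all_freq$variable == "code1", ].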
