
I'm trying to use summarise() from the plyr package to calculate the percentage of occurrences of each level in a factor. EDIT: The Puromycin data is in the base R installation.

My data look like this:

library(plyr)
data.p <- as.data.frame(Puromycin[,3])
names(data.p) <- "Treat.group" 

I've done this:

summarise(data.p,
          "Frequencies" = count(data.p),
          "Percent" = count(data.p) / sum(count(data.p)[2]))

And got this:

  Frequencies.Treat.group Frequencies.freq Percent.Treat.group Percent.freq
1                 treated               12                  NA    0.5217391
2               untreated               11                  NA    0.4782609 

But I don't want the third column (Percent.Treat.group) to be generated. It is unnecessary and only contains NA.

How do I write the code so I don't get that NA column?

Any pointers are appreciated :)

Zach Saucier
Rene Bern

1 Answer


The NA column comes from this expression:

count(data.p)/ sum(count(data.p)[2] )

If you look at the numerator on its own, you get:

R> count(data.p)
  Treat.group freq
1     treated   12
2   untreated   11

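So the NA column (and the accompanying warning) arises because the division is applied to the whole data frame, including the first column, which is a factor: treated/23 is not a meaningful operation and gives NA. To avoid this, divide only the second column of count(data.p):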

summarise(data.p,
          "Frequencies" = count(data.p),
          "Percent" = count(data.p)[, 2] / sum(count(data.p)[2]))
csgillespie
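As a follow-up to the answer above, here is a minimal sketch of the same idea that calls count() only once and divides just the freq column, so the factor column is never part of the division. Building data.p from Puromycin$state and passing "Treat.group" as the vars argument are assumptions about an equivalent setup, not the original poster's code; the column name Percent is taken from the question.

```r
library(plyr)

# Same data as in the question: the third column of Puromycin is `state`
data.p <- data.frame(Treat.group = Puromycin$state)

# Count each level once; the result has columns Treat.group and freq
counts <- count(data.p, "Treat.group")

# Divide only the numeric freq column by the total,
# leaving the factor column untouched (so no NA column appears)
counts$Percent <- counts$freq / sum(counts$freq)

counts
```

This gives one row per level with its frequency and its share of the total (12/23 for treated and 11/23 for untreated), without going through summarise() at all.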