Say I have data in the following format:

categoricalVar,  numericVar,  responseVar
           Foo,           1,         TRUE
           Bar,           0,         TRUE
           Baz,           2,        FALSE
       ...
       ...
       ... MUCH MUCH MORE

I want to create a bar plot where the X axis shows the 3 different values of categoricalVar and the Y axis shows the proportion of each that turned out to be TRUE. A table would work too, like this:

           Foo,   Bar,   Baz
 respPct   0.4,   0.6,   0.9

So out of all the Foos, the percentage of TRUE was 0.4.

The same thing for numericVar would be nice.

             0,     1,     2, ....
 respPct   0.1,   0.2,   0.2

Although I think it makes sense to group the numericVar together, as follows:

           0-5,  5-10, 10-15, ....
 respPct   0.2,   0.3,   0.6

Can someone point me in the right direction?

Jaap
user3240688
    This time I've [created example data](http://stackoverflow.com/a/33031156/2204410) for you, but next time please give a [minimal reproducible example](http://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example/5963610). This will make it a lot easier for others to help you. – Jaap Oct 09 '15 at 06:26

2 Answers

3

You first have to transform your numericVar into a categorical variable. But let's start by creating some example data:

set.seed(2)
df <- data.frame(catVar = rep(c("foo","bar","saz"),each=10),
                 respVar = c(sample(c(TRUE,TRUE,TRUE,FALSE,TRUE), 10, replace =TRUE),
                             sample(c(FALSE,TRUE,TRUE,FALSE,TRUE), 10, replace =TRUE),
                             sample(c(FALSE,FALSE,TRUE,FALSE,TRUE), 10, replace =TRUE)),
                 numVar = sample(0:15, 30, replace =TRUE))

1: create a categorical variable for numVar with:

df$catNum <- cut(df$numVar, breaks = c(-Inf,5,10,Inf), labels = c("0-5", "5-10", "10-15"))
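To make the binning explicit, here is a minimal sketch of how `cut()` assigns values that sit on or next to the break points. The vector `x` is hypothetical and only chosen to hit the edges; the breaks and labels are the ones used above. By default `cut()` builds right-closed intervals, so a value of exactly 5 ends up in "0-5" and a value of exactly 10 in "5-10".

```r
# Hypothetical values chosen to sit on and around the bin edges
x <- c(0, 5, 6, 10, 11, 15)

# Same breaks and labels as above; with the default right = TRUE
# the bins are (-Inf, 5], (5, 10] and (10, Inf]
bins <- cut(x, breaks = c(-Inf, 5, 10, Inf), labels = c("0-5", "5-10", "10-15"))

as.character(bins)
# "0-5" "0-5" "5-10" "5-10" "10-15" "10-15"
```

If you would rather have 5 fall into the "5-10" bin, passing `right = FALSE` switches to left-closed intervals.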

2: aggregate the data with:

# the mean of a logical vector is the proportion of TRUE values
df2 <- aggregate(respVar ~ catVar, df, FUN = mean)
# same calculation per numeric bin
df3 <- aggregate(respVar ~ catNum, df, FUN = mean)

3: create some plots with:

library(ggplot2)

ggplot(df2, aes(x=catVar, y=respVar)) +
  geom_bar(stat="identity")

ggplot(df3, aes(x=catNum, y=respVar)) +
  geom_bar(stat="identity")
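If a table is enough, as the question suggests, something along these lines might work. This is only a sketch: the `toy` data frame and its values are made up to keep the example self-contained, and `tapply()` is used here as one possible base R way to get a named result per group.

```r
# Small hypothetical data set, only here to make the example runnable on its own
toy <- data.frame(
  catVar  = c("foo", "foo", "foo", "foo", "bar", "bar", "baz", "baz"),
  respVar = c(TRUE, TRUE, FALSE, FALSE, TRUE, FALSE, TRUE, TRUE),
  numVar  = c(1, 4, 7, 12, 3, 9, 6, 14)
)
toy$catNum <- cut(toy$numVar, breaks = c(-Inf, 5, 10, Inf),
                  labels = c("0-5", "5-10", "10-15"))

# mean() of a logical vector is the proportion of TRUE values,
# so tapply() returns one proportion per group
respByCat <- tapply(toy$respVar, toy$catVar, mean)
respByNum <- tapply(toy$respVar, toy$catNum, mean)

respByCat
respByNum
```

Both results are named vectors, so they can also be passed straight to base R's `barplot()` if you do not need ggplot2.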

Jaap
1
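library(ggplot2)
# a factor with explicit levels keeps the bins in order; plain strings would be sorted alphabetically
bins <- c("0-5", "5-10", "10-15")
df <- data.frame(a = factor(bins, levels = bins), respPct = c(0.2, 0.3, 0.6))
ggplot(df, aes(x = a, y = respPct)) + geom_bar(stat = "identity")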


Mateusz1981