
I'm making a facet grid bar plot using ggplot2. I keep getting the message "Removed 15 rows containing missing values (geom_bar)" even though there are no missing values in my dataset.

Here's a reproducible example where I'm plotting the likeability of ice cream flavours by schools:

library(ggplot2)
library(reshape2)  # assumed source of melt()

IDs <- seq(1,50)
IDs <- data.frame(rep(IDs, each = 5))
names(IDs)[1] <- "ID"

tastes <- c("Strawberry", "Vanilla", "Chocolate", "Matcha", "Sesame")
tastes <- data.frame(rep(tastes, times = 50))

#random numbers for schools 
A <- runif(250, 1,5)
B <- runif(250, 1,5)
C <- runif(250, 1,5)

#merge
test <- cbind(IDs, tastes)
test <- cbind(test, A)
test <- cbind(test, B)
test <- cbind(test, C)
names(test)[2] <- "Flavour"
#make long
test_long <- melt(test, 
                   id.vars = c("ID", "Flavour"))

#plot
plot <- ggplot(test_long) +
  geom_bar(aes(x = Flavour,
               y = value), stat="summary", fun=mean) + 
  scale_x_discrete(labels=c("C","M","S","S","V")) +
  scale_y_continuous(name = "Rating", limits = c(1, 5)) +
  facet_grid(. ~ variable) + 
  labs(title = "Likeability of Different Flavours by School") +
  theme(panel.grid.major = element_blank(), panel.grid.minor = element_blank(),
        panel.background = element_blank(), axis.line = element_line(colour = "black"))
plot

Does anyone know why the error message keeps coming up? Thank you!

jo_

1 Answer


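The bars are drawn from zero, so with scale_y_continuous(limits=c(1,5)) the base of every bar falls outside the scale limits and the bar is dropped. That is where the 15 comes from: 5 flavours × 3 schools, i.e. every bar in the plot. The message is about these out-of-range bars, not about NA values in your data. You can fix this by setting the lower limit to 0.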
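As a rough sketch of the first fix: the data frame `demo` below is a made-up stand-in for `test_long` (same columns, random ratings between 1 and 5), and `p_fixed` is just an illustrative name. The only change to the plotting code is the lower limit.

```r
library(ggplot2)

# Hypothetical stand-in for test_long: 50 IDs x 5 flavours x 3 schools
set.seed(1)
demo <- expand.grid(
  ID = 1:50,
  Flavour = c("Strawberry", "Vanilla", "Chocolate", "Matcha", "Sesame"),
  variable = c("A", "B", "C")
)
demo$value <- runif(nrow(demo), 1, 5)

# Lower limit is 0, so the base of each bar lies inside the scale limits
p_fixed <- ggplot(demo) +
  geom_bar(aes(x = Flavour, y = value), stat = "summary", fun = mean) +
  scale_y_continuous(name = "Rating", limits = c(0, 5)) +
  facet_grid(. ~ variable)
p_fixed
```

With this version the "Removed 15 rows" message should no longer appear, at the cost of the axis starting at 0 rather than 1.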

Alternatively, you can drop the limits from scale_y_continuous() (scale limits discard anything that falls outside them before it is drawn) and use coord_cartesian(ylim=c(1,5)) instead, which only zooms the view and still draws bars that extend beyond the plot window.
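A minimal sketch of the coord_cartesian() route, again on a made-up `demo` data frame standing in for `test_long` (`p_zoom` is an illustrative name). The axis title stays in scale_y_continuous(), but the limits move to the coordinate system.

```r
library(ggplot2)

# Hypothetical stand-in for test_long: 50 IDs x 5 flavours x 3 schools
set.seed(1)
demo <- expand.grid(
  ID = 1:50,
  Flavour = c("Strawberry", "Vanilla", "Chocolate", "Matcha", "Sesame"),
  variable = c("A", "B", "C")
)
demo$value <- runif(nrow(demo), 1, 5)

# No scale limits: nothing is discarded. coord_cartesian() only zooms the view.
p_zoom <- ggplot(demo) +
  geom_bar(aes(x = Flavour, y = value), stat = "summary", fun = mean) +
  scale_y_continuous(name = "Rating") +
  coord_cartesian(ylim = c(1, 5)) +
  facet_grid(. ~ variable)
p_zoom
```

Here the bars are still computed from 0 up to the mean; the part below 1 is simply cut off by the zoomed view, so the y axis can start at 1 without any rows being removed.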

The help file for ?coord_cartesian explains a bit about the two different methods of setting the axes, which can make a big difference when plotting summaries of the data, or fitting smoothers.

Miff