
This is a bit of a newbie question. I am using the package "nycflights13" in R, and "tidyverse".

library(nycflights13)  
library(tidyverse)

I am trying to get a bar chart that shows the total number of flights by airline/carrier, and have it color each bar by the number of flights that occurred each month.

I can get a simple bar chart to show with the following:

ggplot(flights) +  
    geom_bar(mapping=aes(x=carrier))

When I try to color it with the month, it doesn't change anything.

ggplot(flights) +  
    geom_bar(mapping=aes(x=carrier, fill=month))

The graph generated by the code above looks exactly the same.

It seems to work when I do the opposite... if I create a chart with "month" on the x-axis and color by carrier, it works just like I would expect.

ggplot(flights) +  
    geom_bar(mapping=aes(x=month,fill=carrier))

I assume it has something to do with discrete vs continuous variables?

ksulli10
  • Are you looking for something like this: `ggplot(flights) + geom_bar(mapping=aes(x=carrier, fill=as.factor(month)))`? – M.Viking Jun 20 '19 at 02:16
  • That's exactly what I was looking for! Thank you! Is there a reason why it needs the "as.factor" part? I think I just don't understand that portion yet. Also, if you put this as an answer I think I can mark it solved? – ksulli10 Jun 20 '19 at 02:24
  • Great! In this case `ggplot2` cannot work out how to map the continuous month variable onto the bar chart by airline carrier, especially since the data looks like `flights$month ([1] 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1...)`. Converting to a factor lets `ggplot2` work out "there are this many segments to fill, and they have this much volume". – M.Viking Jun 20 '19 at 02:42

1 Answer


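Yes, this has to do with discrete vs continuous variables. month is stored as an integer, so ggplot2 treats it as continuous and does not split each bar into segments. as.factor() converts month to a discrete factor with one level per month, which gives one fill color per month.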

ggplot(flights) + 
    geom_bar(mapping=aes(x=carrier, fill=as.factor(month))) 
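To see what as.factor() changes, here is a minimal base-R sketch. The `toy` data frame is a made-up stand-in for flights (it is not part of nycflights13), and `month_f` and `segment_counts` are names I picked for illustration. The table at the end is roughly what geom_bar's default count ends up stacking once the fill variable is discrete.

```r
# Toy stand-in for flights: month is an integer, carrier is character.
toy <- data.frame(
  carrier = c("AA", "AA", "AA", "UA", "UA", "DL"),
  month   = c(1L, 1L, 2L, 1L, 3L, 2L)
)

class(toy$month)                 # "integer": ggplot2 treats this as continuous

month_f <- as.factor(toy$month)  # one level per distinct month
class(month_f)                   # "factor": ggplot2 treats this as discrete
levels(month_f)                  # "1" "2" "3"

# Rows per carrier/month combination: one stacked segment per non-zero cell.
segment_counts <- table(carrier = toy$carrier, month = month_f)
segment_counts
```

With a discrete fill, each bar is split into one group per level, so every carrier gets one colored segment per month that has flights.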

For fun, there is a way to override geom_bar's built-in stat_count default and keep month continuous. This requires adding a dummy variable to flights to use as y, and sorting the data by month (otherwise you get weird artifacts in the stacked colors). See the help page, ?geom_bar, for details.

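# Dummy column: every row counts as one flight
flights$n <- 1

flights %>%
  arrange(month) %>%  # sort by month so the stacked colors are not striped
  ggplot(aes(carrier, n, fill = month)) +
  geom_bar(stat = "identity") +
  scale_fill_continuous(low = "blue", high = "red")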
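As a possible variation (my own sketch, not something the question asked for), the dummy column and the sort can be avoided by counting flights per carrier and month first, then drawing the counts with geom_col(), which plots the y values as given. `monthly_counts` and `p` are names chosen for illustration; this assumes nycflights13, dplyr and ggplot2 are installed.

```r
library(dplyr)
library(ggplot2)

# One row per carrier/month pair, with the number of flights in column n.
# count() returns the rows ordered by carrier, then month.
monthly_counts <- count(nycflights13::flights, carrier, month)

# geom_col() uses n directly as the segment height; month stays continuous,
# so the fill is a blue-to-red gradient rather than twelve separate colors.
p <- ggplot(monthly_counts, aes(carrier, n, fill = month)) +
  geom_col() +
  scale_fill_gradient(low = "blue", high = "red")

p
```

Since there is now a single row per segment, the total bar heights should match the plain geom_bar(aes(x=carrier)) chart.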
M.Viking