
I would like to produce histograms with density lines for all my numeric columns, and facet them by another column.

Using the Iris data set as an example, I would like to produce histograms for Sepal.Length, etc., with facets for each of Species.

This is what I have tried:

for (i in colnames(subset(iris, select = -Species))) {
  plot <- ggplot(iris, aes(x = i)) +
    geom_histogram() +
    geom_density(colour = "blue", size = 1) +
    facet_wrap(~ Species, scales = "free")

  print(plot)
}

I also tried

for (i in colnames(subset(iris, select = -Species))) {
  plot <- ggplot(subset(iris, select = -Species), aes(x = i)) +
    geom_histogram() +
    geom_density(colour = "blue", size = 1) +
    facet_wrap(~ iris$Species, scales = "free")

  print(plot)
}

The error I get is

Error in f(): StatBin requires a continuous x variable: the x variable is discrete. Perhaps you want stat="count"?

Do I need to put something in the geom_histogram() command?

Peter Mortensen
Mark Davies

2 Answers


Update, following the OP's request in the comments:

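library(tidyverse)

# Reshape to long format: one row per (Species, measurement name, value)
irislong <- pivot_longer(iris, cols = -Species)

ggplot(irislong, aes(x = value, fill = Species)) +
  # alpha is a fixed setting, so it belongs outside aes();
  # position = "identity" overlays the species instead of stacking them
  geom_histogram(aes(y = after_stat(density)), alpha = 0.4, position = "identity") +
  geom_density(colour = "blue", linewidth = 1, alpha = 0.4) +
  facet_wrap(~ name, scales = "free")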
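If the aim is a separately coloured density line for each species, one option is to map `colour` to `Species` inside `aes()` on the density layer rather than setting a constant colour. The following is only a sketch of that idea: it assumes ggplot2 3.4.0 or later (for `linewidth` and `after_stat()`), and `bins = 30` is an arbitrary choice made to silence the default-bins message.

```r
library(ggplot2)
library(tidyr)

# Long format: columns Species, name, value
irislong <- pivot_longer(iris, cols = -Species)

p <- ggplot(irislong, aes(x = value)) +
  # Histogram on the density scale, one semi-transparent fill per species
  geom_histogram(aes(y = after_stat(density), fill = Species),
                 alpha = 0.4, position = "identity", bins = 30) +
  # Mapping colour inside aes() draws one density line per species
  geom_density(aes(colour = Species), linewidth = 1) +
  facet_wrap(~ name, scales = "free")

print(p)
```

Because `colour` is mapped rather than set, ggplot2 groups the density estimate by species and adds a matching legend entry.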


First answer: Here is one possible solution:

  1. Bring the data into long format with pivot_longer(), then apply fill = name and facet_wrap() as you did:
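library(tidyverse)

iris %>%
  pivot_longer(-Species) %>%
  ggplot(aes(x = value, fill = name)) +
    # Put the histogram on the density scale so the density curves are visible
    geom_histogram(aes(y = after_stat(density))) +
    geom_density(colour = "blue", linewidth = 1) +
    facet_wrap(~ Species, scales = "free")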
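To make the reshaping step concrete, here is a small sketch of what `pivot_longer()` returns for `iris` with its default arguments. The object name `irislong` is just an illustrative choice.

```r
library(tidyr)

# Every column except Species is stacked into name/value pairs
irislong <- pivot_longer(iris, cols = -Species)

# By default the new columns are called `name` and `value`
names(irislong)

# 150 rows x 4 measurement columns gives 600 rows
nrow(irislong)

head(irislong, 4)
```

The `name` column holds the original column names (Sepal.Length, Sepal.Width, and so on), which is why it can be used directly for `fill` or in `facet_wrap()`.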


TarJae
  • This is very helpful. I originally wanted a separate plot for each variable, but this may be more useful. I've now done this: `irislong = pivot_longer(iris, cols = -Species); ggplot(irislong, aes(x = value, colour = Species)) + geom_histogram(aes(y = ..density..)) + geom_density(colour = "blue", size = 1) + facet_wrap(~ name, scales = "free")` How can I get a separate `geom_density` line for each species? – Mark Davies Nov 12 '22 at 18:10
  • Please see my update. Replacing the `colour` aesthetic with `fill` should give you the desired output. – TarJae Nov 12 '22 at 18:38

I found a helpful answer to this question. I needed to use aes_string. This is what got the desired effect for my original question:

for (i in colnames(subset(iris, select = -Species))) {
  plot <- ggplot(subset(iris, select = -Species), aes_string(x = i)) +
    geom_histogram(aes(y = ..density..)) +
    geom_density(colour = "blue", size = 1) +
    facet_wrap(~ iris$Species, scales = "free")

  print(plot)
}

This produces a separate plot for each variable.

Mark Davies
  • As the tidyverse [docs](https://ggplot2.tidyverse.org/reference/aes_.html) indicate, `aes_string` and the related `aes_` functions are soft-deprecated. Use the `.data` pronoun instead: `aes(.data[[i]])`. – Parfait Nov 12 '22 at 18:27
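A minimal sketch of the loop rewritten with the `.data` pronoun suggested in the comment above. It assumes ggplot2 3.4.0 or later; the names `numeric_cols` and `plots` are illustrative, and `bins = 30` and the explicit axis label are choices made here rather than part of the original answer.

```r
library(ggplot2)

# All columns except the faceting variable
numeric_cols <- setdiff(names(iris), "Species")

# Build one faceted plot per numeric column
plots <- lapply(numeric_cols, function(col) {
  ggplot(iris, aes(x = .data[[col]])) +
    geom_histogram(aes(y = after_stat(density)), bins = 30) +
    geom_density(colour = "blue", linewidth = 1) +
    facet_wrap(~ Species, scales = "free") +
    labs(x = col)  # label the axis with the column name
})
names(plots) <- numeric_cols

# Draw each plot in turn
for (p in plots) print(p)
```

Keeping `Species` in the data passed to `ggplot()` lets `facet_wrap(~ Species)` refer to it by name, so the layer data and the faceting variable cannot get out of step.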