I am writing R code that lets users select columns from a data set and plots a histogram for each of them. I use a 'for' loop to generate the required number of plots with the ggplot2 library and save them in a single list. The problem is that, after every iteration of the 'for' loop, all objects in the list hold the same plot. The final output is therefore a grid of histograms that are labeled differently but all depict the same (last) column.

I understand that this kind of question has been asked before. I found the answers on renaming ggplot2 graphs in a for loop and https://stat.ethz.ch/pipermail/r-help/2008-February/154438.html to be a useful starting point, but they did not solve my problem.

I have used the standard Swiss Fertility dataset available in R to generate the plots. Here is the code:

data_ <- swiss
data_ <- na.omit(data_)

u <- c(2, 3, 4, 5, 6)
plotData <- data_[,u]
bw <- 5
plotType <- 'probability'

library(ggplot2)
library(gridExtra)

histogramList <- vector('list', length(u))

if(plotType=='probability')
{
 for(i in 1:length(u))
 {
   indexDataFrame <- data.frame(plotData[,i])
   probabilityHistogram <- ggplot(indexDataFrame, aes(x=indexDataFrame[,1]))
   # theme() replaces opts(), which newer ggplot2 releases no longer provide
   histogramList[[i]] <- probabilityHistogram +
     geom_histogram(aes(y=..density..), binwidth=bw, colour='black', fill='skyblue') +
     geom_density() +
     scale_x_continuous(names(plotData)[i]) +
     theme(legend.position='none')
 }
} else
if(plotType=='frequency')
{
 for(i in 1:length(u))
 {
   indexDataFrame <- data.frame(plotData[,i])
   probabilityHistogram <- ggplot(indexDataFrame, aes(x=indexDataFrame[,1]))
   # theme() replaces opts(), which newer ggplot2 releases no longer provide
   histogramList[[i]] <- probabilityHistogram +
     geom_histogram(aes(y=..count..), binwidth=bw, colour='black', fill='skyblue') +
     geom_density() +
     scale_x_continuous(names(plotData)[i]) +
     theme(legend.position='none')
 }
}

arg_list <- c(histogramList, list(nrow=3, ncol=2))
#jpeg('histogram', width=1024, height=968)
do.call(grid.arrange, arg_list)
#graphics.off()

I apologize if I have missed an obvious answer to the question in this forum and shall be grateful if you could direct me towards it. I hope my explanation is clear and if not, please let me know about the clarifications required.

Thanks!

tejas_kale

2 Answers

You can vastly simplify your code by:

  1. Using facets, rather than manually arranging multiple plots
  2. Melting your data with the function melt in package reshape2
  3. Removing the loop, which the first two steps make unnecessary
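
As a rough illustration of what the melting step produces, here is a minimal base-R sketch. It uses stack() as a stand-in for melt(), so it is not the reshape2 call itself, and the name longData is only illustrative. The columns are renamed to value and variable because those are the names melt() gives its output.

```r
# Sketch only: base-R stack() used in place of reshape2::melt()
plotData <- swiss[, c(2, 3, 4, 5, 6)]

# stack() puts all values in `values` and the source column name in `ind`
longData <- stack(plotData)

# Rename to the column names that melt() would produce
names(longData) <- c("value", "variable")

# One row per (observation, column) pair; `variable` is a factor
# whose levels follow the original column order
str(longData)
```

With the data in this shape, a single ggplot call with facet_wrap(~variable) can draw one panel per original column.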

Here is a complete rewrite of your code, with no loop in sight.

data_ <- swiss
data_ <- na.omit(data_)

u <- c(2, 3, 4, 5, 6)
plotData <- data_[,u]
bw <- 5
plotType <- 'frequency'

library(ggplot2)
library(reshape2)

mdat <- melt(plotData)

if(plotType=='probability'){
  ph <- ggplot(mdat, aes(value)) +
    geom_histogram(aes(y=..density..), binwidth=bw, colour='black', fill='skyblue') + 
    geom_density() + 
    facet_wrap(~variable, scales="free")
} 

if(plotType=='frequency'){
  ph <- ggplot(mdat, aes(value)) +
    geom_histogram(aes(y=..count..), binwidth=bw, colour='black', fill='skyblue') + 
    geom_density() + 
    facet_wrap(~variable, scales="free")
}

print(ph)

The resulting graphics:

[Figure: faceted histograms with plotType set to 'probability']

[Figure: faceted histograms with plotType set to 'frequency']

Andrie

Instead of mapping aesthetics using aes, you might be better off using aes_string:

 for(i in 1:length(u))
 {
   probabilityHistogram <- ggplot(plotData, aes_string(x=names(plotData)[i]))
   # theme() replaces opts(), which newer ggplot2 releases no longer provide
   histogramList[[i]] <- probabilityHistogram +
     geom_histogram(aes(y=..density..), binwidth=bw, colour='black', fill='skyblue') +
     geom_density() +
     scale_x_continuous(names(plotData)[i]) +
     theme(legend.position='none')
 }

That worked for me, at least. This avoids having to subset your data and allows you to reference the column you want to plot by quoted name.
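
For readers on a recent ggplot2, the same idea can be sketched with the .data pronoun, since aes_string is deprecated in newer releases. The background is that aes() stores its expression and evaluates it only when the plot is drawn, so an expression like indexDataFrame[,1] sees whatever indexDataFrame holds after the loop has finished. The sketch below assumes ggplot2 3.3.0 or later (for after_stat); makeHistogram is a hypothetical helper name, not part of any package.

```r
library(ggplot2)

plotData <- swiss[, c(2, 3, 4, 5, 6)]
bw <- 5

# Hypothetical helper: builds one density-scaled histogram for the column
# named by columnName. Each call has its own environment, so the column
# name is still correct when the plot is drawn later.
makeHistogram <- function(columnName) {
  ggplot(plotData, aes(x = .data[[columnName]])) +
    geom_histogram(aes(y = after_stat(density)), binwidth = bw,
                   colour = 'black', fill = 'skyblue') +
    geom_density() +
    scale_x_continuous(columnName)
}

# One plot per selected column, collected in a list
histogramList <- lapply(names(plotData), makeHistogram)
```

If gridExtra 2.0 or later is installed, the list can probably be drawn in one call with gridExtra::grid.arrange(grobs = histogramList, ncol = 2).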

BenBarnes