constructing ggplot calls based on the number of inputs automatically

Question

I have several TSV files, each with one column and the same number of rows. I am plotting them with ggplot (stat_smooth), but I would like the program to be flexible, meaning it should add more stat_smooth calls depending on how many files are provided as input.

The length of the input is taken from length(commandArgs(TRUE)) and I am storing my data in a variable as

cov=data.frame(sapply(1:length(commandArgs(TRUE)),
    function(i)read.csv(proteins[i],sep='\t',colClasses=c(NA,"NULL"))))

where proteins<-commandArgs(TRUE) are the files, and I am adding the column names with separate code.

Now the problem comes with ggplot: how can I add stat_smooth calls on the fly, depending on the number of arguments provided?

I was trying something like:

m=ggplot(cov,aes(seq,cov[,2]))
p=function(i){return(stat_smooth(aes(color=colnames(cov)[i])))}
m+p(1)+.....

that is, adding p to the base ggplot object m in a for loop, but that doesn't seem to make sense.

There should be a more efficient way to do this. The idea would be to construct the calls based on the columns in the cov data.frame, which has data like

seq     fileA     fileB
  1  8429.262  8606.623
  2  8766.138  9066.361
  3  9081.893  9456.915
  4  9342.380  9784.373
  5  9480.860 10067.121
  6  9581.437 10312.253

Can someone suggest something?

you can try `aes_string`, but reshaping is probably the better option — baptiste, Mar 14 '13 at 18:34

score 1 · Accepted Answer · answered Mar 14 '13 at 17:27

Firstly, reshape your data to the long format, with one value per row.

library(reshape2)
covM <- melt(cov, id.vars = "seq")  # keep seq as the identifier column

This returns the following data frame:

   seq variable     value
1    1    fileA  8429.262
2    2    fileA  8766.138
3    3    fileA  9081.893
4    4    fileA  9342.380
5    5    fileA  9480.860
6    6    fileA  9581.437
7    1    fileB  8606.623
8    2    fileB  9066.361
9    3    fileB  9456.915
10   4    fileB  9784.373
11   5    fileB 10067.121
12   6    fileB 10312.253
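
The same long format can also be produced with tidyr, which is an alternative to reshape2 rather than the method used in this answer. A minimal sketch, assuming the tidyr package is installed and using the example values from the question:

```r
library(tidyr)

# Example data copied from the question
cov <- data.frame(
  seq   = 1:6,
  fileA = c(8429.262, 8766.138, 9081.893, 9342.380, 9480.860, 9581.437),
  fileB = c(8606.623, 9066.361, 9456.915, 9784.373, 10067.121, 10312.253)
)

# Every column except seq becomes a (variable, value) pair
covM <- pivot_longer(cov, cols = -seq,
                     names_to = "variable", values_to = "value")
```

The rows come out in a different order than with melt (grouped by seq rather than by file), which does not matter for plotting.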

Once you have the new object, it's easy to plot:

library(ggplot2)
ggplot(covM, aes(seq, value)) +
  stat_smooth(aes(color = variable))
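
If you would rather keep the wide data frame, the per-column idea from the question can also be made to work, because ggplot2 accepts a list of layers on the right-hand side of `+`. This is only a sketch, assuming ggplot2 3.0.0 or later for the `.data` pronoun; the names `value_cols` and `layers` are chosen here for illustration.

```r
library(ggplot2)

# Example data copied from the question
cov <- data.frame(
  seq   = 1:6,
  fileA = c(8429.262, 8766.138, 9081.893, 9342.380, 9480.860, 9581.437),
  fileB = c(8606.623, 9066.361, 9456.915, 9784.373, 10067.121, 10312.253)
)

# Every column except seq gets its own smoother
value_cols <- setdiff(names(cov), "seq")

# Build one stat_smooth layer per column; .data[[nm]] looks the column up
# by name, and color = nm labels the line with the column name
layers <- lapply(value_cols, function(nm) {
  stat_smooth(aes(y = .data[[nm]], color = nm))
})

# Adding the list adds all of its layers at once
p <- ggplot(cov, aes(seq)) +
  layers +
  labs(y = "value", color = "variable")
```

Reshaping to the long format is still the more idiomatic route, since a single stat_smooth call then handles any number of files.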

