
Possible Duplicate:
geom_boxplot with precomputed values

I have a table in which each row is a different sample and the columns are, in order: name, minimum, maximum, mean, 25th percentile, 50th percentile, and 75th percentile. Here is a sample:

sample1   1   38   10   8    10   13
sample2   1   39   10   9    11   14
sample3   2   36   11   10   10   13

I would like to know how I can use data in this format to draw boxplots, since these summary values are what a boxplot actually displays. The table above is tab-separated. Thanks.

Julio Diaz
  • @joran thanks for pointing that out, I will check the post you mention and if so I'll close this one. – Julio Diaz Jun 20 '12 at 23:13
  • @GSee's comment has vanished, but it was a good point. I believe the base `boxplot` function accepts precomputed values as well, but I couldn't find a question on SO dealing directly with that function. – joran Jun 20 '12 at 23:14
  • @joran From that post it was unclear to me how I could input data from the tsv format. My table has many more samples so it would be tough to input the data like that. – Julio Diaz Jun 20 '12 at 23:18
  • So if, as it seems, you don't know how to read data from a file into R, that seems like a separate, and very different, question from "how do I make a boxplot with precomputed values". – joran Jun 20 '12 at 23:28
  • When I posted I had no idea how to make a boxplot with precomputed values. Now I have an idea of how to do it by entering the data manually, and I guess my remaining problem is getting from the TSV file to the data.frame. I will edit my question – Julio Diaz Jun 21 '12 at 00:10
  • Please don't change the very nature of your question. It renders current answers and comments nonsensical. Now that you know how to make boxplots with precomputed values, if you run into specific problems implementing that solution, ask a separate question. – joran Jun 21 '12 at 00:50

2 Answers


This post shows how you can do this with bxp, the function that boxplot uses to do the drawing. You need to put your data in the right order first: one column per box, with the first row being the minimum, then the 25th, 50th, and 75th percentiles, and the last row being the maximum.

First, read in the data

dat <- read.table(text="sample1   1   38   10   8    10   13
sample2   1   39   10   9    11   14
sample3   2   36   11   10   10   13", row.names=1, header=FALSE)

Then, put in order and transpose

dat2 <- t(dat[, c(1, 4, 5, 6, 2)]) #Min, 25pct, 50pct, 75pct, Max

and plot

bxp(list(stats=dat2, n=rep(10, ncol(dat2)))) #n is the number of observations in each group
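For a larger table that lives in a file, the same steps can be applied to the file directly. The following is only a sketch under a few assumptions: the file is tab-separated with no header row, the columns are in the order given in the question, and the column names and the temporary file used here are made up for illustration. The `names` element is added so that each box is labelled with its sample name.

```r
# Stand-in for a real file: write the question's three rows to a temporary
# tab-separated file (replace tsv_path with the path to your own file).
tsv_path <- tempfile(fileext = ".tsv")
writeLines(c("sample1\t1\t38\t10\t8\t10\t13",
             "sample2\t1\t39\t10\t9\t11\t14",
             "sample3\t2\t36\t11\t10\t10\t13"), tsv_path)

# Column names are hypothetical labels; the order follows the question.
dat <- read.table(tsv_path, sep = "\t", header = FALSE, row.names = 1,
                  col.names = c("name", "min", "max", "mean",
                                "q25", "q50", "q75"))

# bxp() wants one column per box, rows ordered min, q25, median, q75, max.
stats <- t(as.matrix(dat[, c("min", "q25", "q50", "q75", "max")]))

# n is a placeholder, since the true group sizes are not in the table.
bxp(list(stats = stats, n = rep(1, ncol(stats)), names = colnames(stats)))
```

Selecting the columns by name rather than by position makes the reordering step easier to check when the table has many samples.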
GSee
  • In the last line in n=rep(10,3), 3 is the number of samples, so what is 10? – Julio Diaz Jun 21 '12 at 15:50
  • 3 is the number of groups. 10 is the number of observations in each group, and it's just a guess. How many observations did sample1 have? – GSee Jun 21 '12 at 15:52
  • I don't think it matters here. You could make it `n=rep(1, 3)` if you wanted – GSee Jun 21 '12 at 15:54

This is a duplicate, however for posterity and since I already started writing...

library(ggplot2)

dat <- data.frame(name=paste0('sample',1:3), min=c(1,1,2), max=c(38,39,36), mean=c(10,10,11), q25=c(8,9,10), q50=c(10,11,10), q75=c(13,14,13))

# stat='identity' makes geom_boxplot use the supplied summary values as-is
ggplot(dat, aes(x=name, ymin=min, ymax=max, lower=q25, middle=q50, upper=q75)) +
  geom_boxplot(stat='identity')
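If the table has too many samples to type in by hand, the data frame can be built with `read.table` instead. This is a rough sketch assuming tab-separated input with no header; the column names are chosen here to match the `aes()` mapping and are not part of the original file. Drawing the mean as an extra point is an optional addition, since the boxplot itself does not use that column.

```r
library(ggplot2)

# Inline text keeps the sketch self-contained; for a real file, pass its
# path as the first argument instead of using text=.
dat <- read.table(text = "sample1\t1\t38\t10\t8\t10\t13
sample2\t1\t39\t10\t9\t11\t14
sample3\t2\t36\t11\t10\t10\t13",
                  sep = "\t", header = FALSE,
                  col.names = c("name", "min", "max", "mean",
                                "q25", "q50", "q75"))

p <- ggplot(dat, aes(x = name, ymin = min, lower = q25, middle = q50,
                     upper = q75, ymax = max)) +
  geom_boxplot(stat = "identity") +
  geom_point(aes(y = mean), shape = 4)  # optional: mark the mean with a cross

print(p)
```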
Justin