I am trying to visualise my data with box plots in R and got stuck. My data looks as follows:

id  var.1 var.2 ... var.n value
a   0     1     ... 2     1.7
b   2     1     ... 0     1.4
... ...   ...   ... ...   ...
a   1     2     ... 2     5.3
b   1     2     ... 1     2.4

Now, I would like to have a series of boxplots: value~var.1, value~var.2, ..., value~var.n, preferably as a ggplot2 facet-type plot. My attempts using melt, reshape and split failed miserably. I would appreciate a hint from someone with a fresh mind.

I guess reshaped data should be of the form:

a var.1 0 1.7
a var.2 1 1.7
...
b var.1 2 1.4
b var.2 1 1.4
...

so that I can use the interaction of columns 2 and 3...

kiero

2 Answers

Data

set.seed(1)
dat <- do.call(cbind.data.frame, rep(list(gl(3, 10)), 5))
names(dat) <- paste("var", 1:5, sep = ".")
dat$value <- rnorm(30)
head(dat)
#   var.1 var.2 var.3 var.4 var.5      value
# 1     1     1     1     1     1 -0.6264538
# 2     1     1     1     1     1  0.1836433
# 3     1     1     1     1     1 -0.8356286
# 4     1     1     1     1     1  1.5952808
# 5     1     1     1     1     1  0.3295078
# 6     1     1     1     1     1 -0.8204684

Method

First, we need to transform the data into a shape that ggplot can work with conveniently. You could use reshape from base R (admittedly, the syntax is not self-explanatory, and I need some trial and error every time I use it):

datm <- reshape(dat, direction = "long", varying = paste("var", 1:5, sep = "."), 
                new.row.names = 1:((ncol(dat) - 1) * nrow(dat)), timevar = "i", 
                v.names = "x")
head(datm)
#        value i x id
# 1 -0.6264538 1 1  1
# 2  0.1836433 1 1  2
# 3 -0.8356286 1 1  3
# 4  1.5952808 1 1  4
# 5  0.3295078 1 1  5
# 6 -0.8204684 1 1  6
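
Since the reshape arguments are hard to remember, here is a minimal sketch on a toy data frame that spells out what each one does. The names toy, toy_long, level and which are made up for illustration and are not part of the question's data; the times argument is an optional extra that labels each block with its source column instead of 1, 2, ...

```r
# Toy wide data: two grouping columns and one measurement (illustrative only)
toy <- data.frame(var.1 = c(0, 2), var.2 = c(1, 1), value = c(1.7, 1.4))

toy_long <- reshape(
  toy,
  direction = "long",
  varying   = c("var.1", "var.2"),  # wide columns to stack on top of each other
  v.names   = "level",              # name of the column that receives the stacked entries
  timevar   = "which",              # name of the column recording the source column
  times     = c("var.1", "var.2")   # labels written into `which` (default: 1, 2)
)

# Columns not listed in `varying` (here: value) are repeated for every block;
# reshape also adds an `id` column identifying the original row.
print(toy_long)
```

Each original row now appears once per var.* column, which is the layout the question asks for.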

Now, you could do your boxplot(value ~ x) by means of ggplot2:

library(ggplot2)
ggplot(datm, aes(x = x, y = value)) + geom_boxplot() + facet_wrap(~i)

Does that answer your question?

thothal

You can also use the melt function from reshape2:

library(reshape2)
melt(dat, id='value')
#           value variable value
# 1   -0.11978146    var.1     1
# 2   -0.78996525    var.1     1
# 3    0.54246428    var.1     1
# 4    0.09325227    var.1     1
# 5    0.63954407    var.1     1
# 6    1.48611802    var.1     1
# ...
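
Note that the melted result above has two columns called value, which makes it awkward to refer to either one in a plot. A possible way to combine this with the faceted boxplot is sketched below, assuming reshape2 and ggplot2 are installed and reusing the simulated data from the other answer. The column name level and the object names datm and p are my own choices.

```r
library(reshape2)
library(ggplot2)

# Same simulated data as in the other answer
set.seed(1)
dat <- do.call(cbind.data.frame, rep(list(gl(3, 10)), 5))
names(dat) <- paste("var", 1:5, sep = ".")
dat$value <- rnorm(30)

# value.name gives the melted entries their own name ("level" is an arbitrary choice),
# so the result does not end up with two columns called "value"
datm <- melt(dat, id.vars = "value", variable.name = "variable", value.name = "level")

# One panel per original var.* column, one box per level within each panel
p <- ggplot(datm, aes(x = factor(level), y = value)) +
  geom_boxplot() +
  facet_wrap(~ variable)
print(p)
```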
rnso
  • I combined both answers and, yes, it works now! Need to play more with reshape2. Useful thing to master. – kiero Oct 28 '14 at 15:56