I have 4 files, named 0_X_cell.csv, 0_S_cell.csv, 15_X_cell.csv and 15_S_cell.csv, each in the format:

   p    U:0      U:1         U:2    Tracer  Tracer_0    U_0:0
-34.014 0.15268 -3.7907 -0.20155    10.081  10.032      0.12454
-33.836 0.07349 -2.1457 -0.30531    27.706  27.278      0.076542

I'd like to create boxplots out of the values for Tracer/3600 and put them on the same graph using ggplot2 but I'm finding it not quite so straightforward. Any suggestions would be much appreciated:

I'm thinking it might be something like this:

  1. Import data from all files into separate variables
  2. Extract Tracer from each one and put into a data.frame
  3. Plot the boxplots of every column Tracer/3600. But each column will be called Tracer...

What would the correct procedure be?

jeremycg
HCAI

2 Answers

Here's one way to do it (if I understood you correctly):

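# Stand-in for the four files: the same sample data assigned to four objects named after them
`0_X_cell.csv` <- `0_S_cell.csv` <- `15_X_cell.csv` <- `15_S_cell.csv` <- read.table(header=TRUE, text="
  p    U:0      U:1         U:2    Tracer  Tracer_0    U_0:0
-34.014 0.15268 -3.7907 -0.20155    10.081  10.032      0.12454
-33.836 0.07349 -2.1457 -0.30531    27.706  27.278      0.076542")
# Collect every object whose name contains "cell.csv" into a named list
lst <- mget(grep("cell.csv", ls(), fixed=TRUE, value=TRUE))
# Extract Tracer from each and stack into long format (columns: values, ind)
df <- stack(lapply(lapply(lst, "[", "Tracer"), unlist))
# Shorten the group labels, e.g. "0_X_cell.csv" becomes "0_X"
df$ind <- sub("^(\\d+_[A-Z]).*$", "\\1", df$ind)
library(ggplot2)
ggplot(df, aes(ind, values/3600)) + geom_boxplot()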
lukeA
  • Thank you very much for this. I actually have quite a few files, but all with the same structure and naming convention. Could you make it a little more generic please? – HCAI Dec 13 '15 at 13:14
To read in the data from your dir:

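# List every file in the working directory whose name ends in "cell.csv"
files <- list.files(pattern = "cell\\.csv$")
# Keep only Tracer from each file; time and treatment come from the file name
z <- lapply(files, function(f) {
  chars <- strsplit(f, "_")[[1]]
  data.frame(Tracer = read.csv(f)$Tracer, time = chars[1], treatment = chars[2])
})
z <- do.call(rbind, z)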

Then plot it:

library(ggplot2)

ggplot(z, aes(y = Tracer/3600, x = factor(time))) + geom_boxplot(aes(fill = factor(treatment))) + ylab("Tracer")
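
If you want to check the reading step before pointing it at your real data, here is a minimal sketch of the same idea on dummy files. The temporary directory, the row counts and the random values are all made up for illustration; only the file names and the Tracer column come from the question. It also suggests that files with different numbers of rows should stack without trouble, since only Tracer is kept from each one.

```r
# Hypothetical demo data: four small files with different row counts,
# written to a temporary directory (file names follow the question).
demo_dir <- file.path(tempdir(), "cell_demo")
dir.create(demo_dir, showWarnings = FALSE)
set.seed(1)
n_rows <- c("0_X_cell.csv" = 5, "0_S_cell.csv" = 8,
            "15_X_cell.csv" = 3, "15_S_cell.csv" = 6)
for (f in names(n_rows)) {
  n <- n_rows[[f]]
  write.csv(data.frame(p = rnorm(n), Tracer = runif(n, 0, 30)),
            file.path(demo_dir, f), row.names = FALSE)
}

# Same steps as in the answer: list the files, keep only Tracer, tag each
# row with the time and treatment parsed from the file name, then stack.
files <- list.files(demo_dir, pattern = "cell\\.csv$", full.names = TRUE)
z <- lapply(files, function(f) {
  parts <- strsplit(basename(f), "_")[[1]]
  data.frame(Tracer = read.csv(f)$Tracer,
             time = parts[1], treatment = parts[2])
})
z <- do.call(rbind, z)   # list of data frames -> one data frame

str(z)                        # one row per Tracer value, plus time and treatment
table(z$time, z$treatment)    # row counts per time/treatment combination
```

Because `full.names = TRUE` returns full paths here, the file name is taken with `basename()` before splitting on the underscore.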

jeremycg
  • Thank you, this is fantastic. I do get one error though: Error: ggplot2 doesn't know how to deal with data of class list. What do you think this is referring to? Cheers – HCAI Dec 12 '15 at 18:24
  • did you include the `z <- do.call(rbind, z)`? This step changes the list to a data.frame – jeremycg Dec 12 '15 at 18:34
  • Thank you, I think that must have been missing. Is there a way of dealing with data of different lengths? `Error in rbind(deparse.level, ...) : numbers of columns of arguments do not match` – HCAI Dec 13 '15 at 11:40
  • Does this make sense to you at all? It seems a bit cryptic... `> warnings() Warning messages: 1: In Ops.factor(left) : ‘+’ not meaningful for factors ` – HCAI Dec 13 '15 at 11:49
  • Ahhh, I realised that my files are all of different lengths but still the same format, e.g. 1000000 rows in 0_S... and 500000 in 0_X.... – HCAI Dec 13 '15 at 13:38
  • It looks like you have different numbers of columns in different files; try the edit above. – jeremycg Dec 13 '15 at 16:49
  • Thank you very much for your help, it's much appreciated. It works and looks good. One thing that isn't working too well is changing the width of the boxplots. Any suggestions? – HCAI Dec 14 '15 at 19:47
  • Try adding `width = 0.5` inside the `geom_boxplot` call – jeremycg Dec 15 '15 at 13:43
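
For anyone following the comments, this is roughly where `width = 0.5` goes. It's a sketch only: the data frame below is a made-up stand-in for the combined `z` from the answer, with random Tracer values.

```r
library(ggplot2)

# Hypothetical stand-in for the combined data frame `z` built in the answer.
set.seed(1)
z <- data.frame(
  Tracer    = runif(80, 0, 30 * 3600),
  time      = rep(c("0", "15"), each = 40),
  treatment = rep(c("S", "X"), times = 40)
)

# width is an argument of geom_boxplot() itself, so it goes outside aes().
p <- ggplot(z, aes(x = factor(time), y = Tracer / 3600)) +
  geom_boxplot(aes(fill = factor(treatment)), width = 0.5) +
  ylab("Tracer")
print(p)
```

Smaller values of `width` give narrower boxes, so it may take a couple of tries to find one that looks right with your number of groups.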