I'm a real beginner and I'm trying to analyze data on the material loss of some metal tubes for my master's thesis. I want to compare the standard deviation of the material loss over an interval for different tubes. I created some submatrices and used tapply to calculate the standard deviation.

I have the following script:

myfunctionSD <- function(mydata) { return(sd(mydata,na.rm=TRUE))}

Alltubes <- tapply(datIn$Material.loss.interval,
                   list(as.factor(datIn$Measurement.location),
                        as.factor(datIn$Tube.number)),
                   myfunctionSD)

The output is a table with Tube.number as the column headers and Measurement.location as the row names.

dput(head(Alltubes))

structure(c(0.871073958553372, NA, 0.697795091282526, NA, 0.838624866472886, 
NA, 0.726992791242471, 0.807567484588899, NA, 0.598675787394729, 
NA, 0.510990323891863, 0.81510216193526, NA, 1.09844645540173, 
NA, 0.839816448199645, NA, 0.63972340253115, NA, 1.11485875917537, 
NA, 0.883318358663128, NA, 0.86706340125676, NA, 1.21565055332783, 
NA, 1.24692213662875, NA, 0.704210691776757, NA, 0.962002980998362, 
NA, 1.00703215272093, NA), .Dim = c(6L, 6L), .Dimnames = list(
    c("1", "10", "11", "12", "13", "14"), c("1", "2", "3", "4", 
    "5", "6")))

Just to be clear: 1, 2, 3, 4, 5, 6 are the tube numbers (the column headers) and 1, 10, 11, ... are the measurement locations (the row names).

I would like to make several plots of the data. One of them is a boxplot with the tube number on the x-axis and the spread of the derived standard deviations on the y-axis.

I have searched everywhere and tried lots of different things, but I can't get a graph to appear.

I hope someone can help me; it would be much appreciated!

r.j.mendel
  • Show us how the data looks: `dput(head(Alltubes))`. Then we may be able to help you. – shadow Oct 15 '13 at 12:05
  • Shadow, I just edited the original message; I hope it is a bit clearer now. There isn't always a calculated value for each measurement location, hence the many NAs. – r.j.mendel Oct 15 '13 at 12:10
  • Thanks for providing data. Is `boxplot(Alltubes)` what you are looking for? – shadow Oct 15 '13 at 12:29
  • Wow, I tried about every difficult plot function and forgot to use the easiest way... It is what I was looking for. Do you also know a quick way to show it as a function of the row headers (Measurement.location) instead of the column headers? Thanks anyway, I still can't believe it works! – r.j.mendel Oct 15 '13 at 12:33
  • And a line plot of the different tubes, with the measurement location on the x-axis and the standard deviation on the y-axis. So 6 lines in total (one for each tube). – r.j.mendel Oct 15 '13 at 12:50
  • To get it by rows, wouldn't it just be `boxplot(t(Alltubes))`? – IRTFM Oct 15 '13 at 13:31

1 Answer

You don't need to create a special function to do what you want. tapply forwards any extra arguments to the function it applies, so you can pass na.rm=TRUE directly:

Alltubes <- tapply(datIn$Material.loss.interval,
                   list(as.factor(datIn$Measurement.location),
                        as.factor(datIn$Tube.number)),
                   sd, na.rm=TRUE)
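As a quick check that the two forms agree, here is a minimal sketch on made-up data. The `toy` data frame and its values are invented for illustration and only borrow the column names from the question; they are not the asker's measurements.

```r
# Hypothetical toy data standing in for datIn (values invented for illustration)
toy <- data.frame(
  Measurement.location   = c(1, 1, 1, 2, 2, 2, 1, 1, 2, 2),
  Tube.number            = c(1, 1, 1, 1, 1, 1, 2, 2, 2, 2),
  Material.loss.interval = c(0.5, 1.5, NA, 2.0, 4.0, 6.0, 1.0, 3.0, NA, 5.0)
)

# Wrapper version, as in the question
myfunctionSD <- function(mydata) sd(mydata, na.rm = TRUE)
with_wrapper <- tapply(toy$Material.loss.interval,
                       list(toy$Measurement.location, toy$Tube.number),
                       myfunctionSD)

# Direct version: tapply hands na.rm = TRUE on to sd() through its `...` argument
direct <- tapply(toy$Material.loss.interval,
                 list(toy$Measurement.location, toy$Tube.number),
                 sd, na.rm = TRUE)

# Compare the two result tables
all.equal(with_wrapper, direct)
```

In this toy table, the cell for location 2 and tube 2 has only one non-missing value, so sd returns NA for it even with na.rm=TRUE. That may be one reason a table like Alltubes still contains NAs.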

Then you can use:

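# 2 x 2 grid of panels with compact margins
par(mfrow=c(2,2), mar=c(4,4,1,1), oma=c(1,1,1,1))

# Spread of the standard deviations per tube (one box per column)
boxplot(Alltubes, names=colnames(Alltubes), xlab="Tube numbers",
        ylab="standard deviation")

# Spread per measurement location (transpose so each row becomes a box)
boxplot(t(Alltubes), names=rownames(Alltubes), 
        xlab="Measurement locations", ylab="standard deviation")

# One line per measurement location, tube number on the x-axis
matplot(as.numeric(colnames(Alltubes)), t(Alltubes), xlab="Tube numbers", 
        ylab="standard deviation", type="b", lty=1, pch=19)

# One line per tube, measurement location on the x-axis
matplot(as.numeric(rownames(Alltubes)), Alltubes, xlab="Measurement locations", 
        ylab="standard deviation", type="b", lty=1, pch=19)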

[Image: the resulting plot]
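For the line plot asked for in the comments (one line per tube, measurement location on the x-axis), two details may need attention. The row names "1", "10", "11", ... in the dput output suggest the locations are sorted as text, and NA cells break the lines drawn by type="b". The sketch below uses the six rows from the question, sorts the rows numerically, and joins each tube's non-missing points. Drawing a line across a missing location is an assumption about what is wanted, not something the data supports by itself.

```r
# First six rows of Alltubes, copied from the dput() output in the question
Alltubes <- structure(c(0.871073958553372, NA, 0.697795091282526, NA, 0.838624866472886,
NA, 0.726992791242471, 0.807567484588899, NA, 0.598675787394729,
NA, 0.510990323891863, 0.81510216193526, NA, 1.09844645540173,
NA, 0.839816448199645, NA, 0.63972340253115, NA, 1.11485875917537,
NA, 0.883318358663128, NA, 0.86706340125676, NA, 1.21565055332783,
NA, 1.24692213662875, NA, 0.704210691776757, NA, 0.962002980998362,
NA, 1.00703215272093, NA), .Dim = c(6L, 6L), .Dimnames = list(
    c("1", "10", "11", "12", "13", "14"), c("1", "2", "3", "4",
    "5", "6")))

# Treat the locations as numbers and put the rows in numeric order
loc  <- as.numeric(rownames(Alltubes))
ord  <- order(loc)
x    <- loc[ord]
sds  <- Alltubes[ord, , drop = FALSE]
cols <- seq_len(ncol(sds))          # one colour per tube

# Points only first, so missing values do not matter yet
matplot(x, sds, type = "p", pch = 19, col = cols,
        xlab = "Measurement locations", ylab = "standard deviation")

# Join each tube's non-missing points (this bridges the NA gaps)
for (j in cols) {
  ok <- !is.na(sds[, j])
  lines(x[ok], sds[ok, j], col = cols[j])
}

legend("topright", legend = paste("Tube", colnames(sds)),
       col = cols, lty = 1, pch = 19, cex = 0.8)
```

With these six rows, most tubes have a value only at every other location, so a plain matplot(..., type="b") would show mostly isolated points; the loop is what produces connected lines.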

Ricardo Oliveros-Ramos