
Sorry to post this, as I know this has come up before in various guises, but I really don't understand what I am doing wrong or the inner workings of R!

I have a (multi-dimensional) array of data that I have read in from a netCDF file and am playing around with. I would like to calculate some "stats" on parts of the array, for example:

data <- array(runif(96*73*26*12), dim=c(96,73,26,12))

part.mean <- apply(data[10:23, 42:56, ,], c(3,4), mean)

Works great. But:

part.sd <- apply(data[10:23, 42:56, ,], c(3,4), sd)

Fails.

What is the correct way then to subset my array and calculate the sd associated with the mean that I can calculate above?

Thanks for your time!

Alex

Roman Luštrik
Alex Archibald
  • In the manual (`?sd`), it says that using `sd` with a matrix is deprecated and that you have to use it this way: `sapply(x, sd)`. So your code becomes: `apply(data[10:23, 42:56, ,], c(3,4), function(x){sapply(x, sd)})` – Pop Aug 08 '12 at 11:09
  • Or simply `part.sd <- apply(data[10:23, 42:56, ,], c(3,4), function(x) sd(as.vector(x)))` to be consistent with the `mean` function – dickoa Aug 08 '12 at 11:09
  • @dickoa I believe that is the correct answer. Do you want to post it as such? – Andrie Aug 08 '12 at 11:10
  • What version of R are you running? In 2.15 or 2.15.1 a warning is issued with instructions on how to proceed. – Roman Luštrik Aug 08 '12 at 11:12
  • @Andrie: James was faster than me... and it doesn't matter anyway, because the most important thing is to have the right information :) – dickoa Aug 08 '12 at 11:13

1 Answer


sd works differently with matrices than mean does: it produces the column standard deviations rather than a single standard deviation for the whole matrix. Converting each slice to a vector first,

part.sd <- apply(data[10:23, 42:56, ,], c(3,4), function(x) sd(as.vector(x)))

should give you a result consistent with the result for the mean.
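As a rough check of the approach above, here is a minimal sketch using the random array from the question. The names `sub` and `m` and the `set.seed()` call are illustrative assumptions, not part of the original post. It spells out the column-wise calculation with `apply(m, 2, sd)` so the contrast does not depend on how a given R version treats `sd()` on a matrix.

```r
# Reproducible stand-in for the netCDF data (the seed is an assumption for illustration)
set.seed(1)
data <- array(runif(96 * 73 * 26 * 12), dim = c(96, 73, 26, 12))

# Subset once so both statistics are computed on the same region
sub <- data[10:23, 42:56, , ]

# apply() hands each [ , , i, j] slice to the function as a 14 x 15 matrix
part.mean <- apply(sub, c(3, 4), mean)
part.sd   <- apply(sub, c(3, 4), function(x) sd(as.vector(x)))

# Contrast on a single slice: column-wise SDs versus one SD for the whole slice
m <- sub[, , 1, 1]
col.sds  <- apply(m, 2, sd)   # one value per column (15 values)
whole.sd <- sd(as.vector(m))  # a single value

# Both results have one entry per (level, time) pair
dim(part.mean)
dim(part.sd)
```

With `as.vector()`, each slice is flattened before `sd()` sees it, so `part.sd` has the same 26 x 12 shape as `part.mean` and each entry describes the same set of values as the corresponding mean.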

James