
Good day. I am looking for help or suggestions with a data set on which I want to run a Mann-Whitney U test. A dummy version of the data.frame looks like this:

 Plant R1 R2 R3 R4 R5
     a  1  2  3  4  5
     a  6  7  8  9 10
     a 11 12 13 14 15
     b 16 17 18 19 20
     b 21 22 23 24 25
     b 26 27 28 29 30
     b 31 32 33 34 35
     c 36 37 38 39 40
     c 41 42 43 44 45
     c 46 47 48 49 50
     d 51 52 53 54 55
     d 56 57 58 59 60

I have 26 different plants and I would like to test the hypothesis that, for every pair of plant species (a, b, c, etc.), there is no significant difference in the median reflectance of each individual waveband (R1, R2, R3, ..., R400; there are 400 waveband columns). The hypothesis is to be tested 325 times, once for each possible pair of the 26 plant species. The null hypothesis should be tested at a significance level of α = 0.00015 (the Bonferroni correction, 0.05/325).
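
A quick check of that arithmetic in R (the variable names here are illustrative only):

```r
n_species <- 26
n_pairs <- choose(n_species, 2)  # number of unordered species pairs
alpha <- 0.05 / n_pairs          # Bonferroni-adjusted per-test threshold
print(n_pairs)
print(signif(alpha, 2))
```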

I am aware of the wilcox.test command for comparing two groups. I searched the CRAN repository and found the npmc package, but it is no longer maintained.

I would like the result to look like this:

Comparison   R1   R2   R3   R4   R5
    ab      p-value
    ac
    ad

But I have no idea where to begin. Can anyone offer any suggestions, please? Thanks in advance.

Kurt

user2507608
  • I would start with `split.data.frame` and with a triple `for` loop, one loop over the columns and the other two over pairs of plants ... – Ben Bolker Jul 12 '13 at 19:03
  • I think you are stepping into a swamp with several statistical alligators. I predict that the reflectance measures are highly correlated across wavebands within subjects and there are of course the more obvious multiple-comparisons problems. I think you need statistical advice more than coding solutions. – IRTFM Jul 12 '13 at 19:30
  • Related to [How to test for non-parametric silmultaneous inference in R](http://stackoverflow.com/questions/14181953/how-to-test-for-non-parametric-silmultaneous-inference-in-r/14182834#14182834) – Alexander Serebrenik Jul 12 '13 at 20:59

5 Answers


Since you are doing multiple comparisons, you can consider multiple contrast test procedures, such as T~ described by Frank Konietschke, Ludwig A. Hothorn, and Edgar Brunner. Since you are interested in comparing all possible pairs, you should use Tukey contrasts. A discussion of the statistical machinery behind T~ is probably not appropriate for Stack Overflow and would be better suited to Cross Validated. The T~ procedure has been implemented in the nparcomp package. Since T~ respects transitivity, its results can be presented as a simplified graph, as suggested by Vasilescu et al.
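
A minimal sketch of what a call to nparcomp might look like for a single waveband. The data below are simulated and hypothetical (three species, 15 spectra each), and the choice of `asy.method = "mult.t"` is an assumption made for illustration, not a recommendation from the references above:

```r
library(nparcomp)

set.seed(42)
# Hypothetical data: one waveband (R1) measured on three plant species
dat <- data.frame(
  Plant = factor(rep(c("a", "b", "c"), each = 15)),
  R1 = c(rnorm(15, mean = 10), rnorm(15, mean = 10.5), rnorm(15, mean = 13))
)

# All-pairs (Tukey-type) nonparametric multiple contrast test
res <- nparcomp(R1 ~ Plant, data = dat, type = "Tukey",
                asy.method = "mult.t", plot.simci = FALSE, info = FALSE)

# One row per pairwise comparison, with estimates and adjusted p-values
print(res$Analysis)
```

For 400 wavebands, the same call would have to be repeated per column, which does not address the correlation between wavebands raised in the comments.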

Alexander Serebrenik
  • The multiple comparisons problem will not be solved with those procedures (and may even be exacerbated by them) because of the auto-correlation issues that make the independence assumption needed for those methods completely untenable. (This would be a great answer to a different question.) – IRTFM Jul 12 '13 at 20:35
  • @DWin: I do not know much about plants and wavebands, but can you explain why you believe the wavebands are not independent? The original question does not seem to be clear on this point. – Alexander Serebrenik Jul 12 '13 at 20:54
  • I suspect you know more than you admit: Most plants are green. If you want to look at a measured reflectance across a range of wavelengths in the visible section of the electromagnetic spectrum here's a citation that does it for several different plant species. See page 680; http://www.amjbot.org/content/88/4/677.full.pdf+html – IRTFM Jul 12 '13 at 21:12
  • @DWin: Thank you! I did not expect this to be related to colors, somehow I expected the word waveband to have a more exotic meaning. – Alexander Serebrenik Jul 12 '13 at 21:14

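I managed this using:

    library(reshape)  # provides melt()
    # For waveband column i of ttest.data, compare every pair of species
    ttest <- pairwise.wilcox.test(ttest.data[, i], Species, p.adjust.method = "bonferroni")
    # Reshape the matrix of adjusted p-values into long format
    ttest.result <- melt(ttest$p.value)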
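
A sketch of how this could be extended to every waveband column at once, producing the comparison-by-waveband table asked for in the question. It uses only base R and the dummy data from the question; the names `dat`, `bands`, and `pvals` are made up for illustration:

```r
# Dummy data from the question: 4 plants, 5 wavebands
dat <- data.frame(
  Plant = rep(c("a", "b", "c", "d"), times = c(3, 4, 3, 2)),
  matrix(1:60, ncol = 5, byrow = TRUE,
         dimnames = list(NULL, paste0("R", 1:5)))
)
bands <- setdiff(names(dat), "Plant")

# For each waveband, run all pairwise Wilcoxon tests and keep the
# Bonferroni-adjusted p-values (lower triangle of the result matrix)
pvals <- sapply(bands, function(b) {
  pw <- pairwise.wilcox.test(dat[[b]], dat$Plant,
                             p.adjust.method = "bonferroni")
  pw$p.value[lower.tri(pw$p.value, diag = TRUE)]
})

# Label rows by species pair, in the same order as the lower triangle
rownames(pvals) <- combn(levels(factor(dat$Plant)), 2, paste, collapse = "")
print(pvals)
```

Because the p-values are already Bonferroni-adjusted across the pairs within each waveband, they would be compared against 0.05 rather than against 0.05/325.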

user2507608

It sounds like you should look into applying Dunn's test. Briefly, Dunn's test is a post-hoc group-by-group difference of location test (using multiple test correction) that can be applied if you reject the null hypothesis in a Kruskal-Wallis test (and determine there is at least one group drawn from a different distribution).

See this answer on Cross Validated for a more detailed example. There is an R package (dunn.test) that provides a dunn.test function with an interface similar to wilcox.test.
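
A minimal sketch of that workflow for one waveband, assuming the dunn.test package is installed. The data are simulated and hypothetical; only `kruskal.test` (base R) and `dunn.test` with its `method` argument are used:

```r
library(dunn.test)

set.seed(1)
# Hypothetical data: one waveband (R1) measured on three plant species
dat <- data.frame(
  Plant = factor(rep(c("a", "b", "c"), each = 12)),
  R1 = c(rnorm(12, mean = 10), rnorm(12, mean = 10.5), rnorm(12, mean = 14))
)

# Step 1: omnibus Kruskal-Wallis test across all species
kw <- kruskal.test(R1 ~ Plant, data = dat)
print(kw$p.value)

# Step 2: Dunn's post-hoc pairwise comparisons with Bonferroni adjustment
dt <- dunn.test(dat$R1, dat$Plant, method = "bonferroni")

# Adjusted p-values, one per pair of species
print(data.frame(comparison = dt$comparisons, p.adjusted = dt$P.adjusted))
```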

LeeZamparo

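Try the DepthProc package for R:

    library(MASS)       # provides mvrnorm()
    library(DepthProc)  # provides mWilcoxonTest()
    # Two bivariate normal samples that differ only in scale
    x <- mvrnorm(100, c(0, 0), diag(2))
    y <- mvrnorm(100, c(0, 0), diag(2) * 1.4)
    mWilcoxonTest(x, y)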

It's a multivariate analog of the Wilcoxon-Mann-Whitney test, based on the concept of data depth:
https://projecteuclid.org/download/pdf_1/euclid.aos/1176344722
https://projecteuclid.org/download/pdfview_1/euclid.ss/1113832733

savinson

What I suggest is to run a permutational MANOVA using some distance measure, such as the Euclidean distance.

Then, if the multivariate assumptions hold, perform Hotelling's T² test; if not, you can use the permutational Hotelling test.
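
The permutational MANOVA step can be sketched in base R as follows. This is a hand-rolled illustration on simulated, hypothetical data (three species, ten spectra each, five wavebands), using the distance-based pseudo-F statistic with Euclidean distances; the helper `pseudo_F` is made up for this example, and packages such as vegan (`adonis2`) provide maintained implementations:

```r
set.seed(1)
# Hypothetical data: species "c" is shifted in every waveband
grp <- factor(rep(c("a", "b", "c"), each = 10))
X <- matrix(rnorm(30 * 5), ncol = 5)
X[grp == "c", ] <- X[grp == "c", ] + 2

# Pseudo-F from a matrix of squared distances and a grouping factor
pseudo_F <- function(d2, g) {
  N <- length(g)
  a <- nlevels(g)
  ss_total <- sum(d2[upper.tri(d2)]) / N
  ss_within <- sum(sapply(levels(g), function(l) {
    idx <- which(g == l)
    sub <- d2[idx, idx]
    sum(sub[upper.tri(sub)]) / length(idx)
  }))
  ((ss_total - ss_within) / (a - 1)) / (ss_within / (N - a))
}

d2 <- as.matrix(dist(X, method = "euclidean"))^2
F_obs <- pseudo_F(d2, grp)

# Null distribution: recompute pseudo-F after shuffling the species labels
F_perm <- replicate(999, pseudo_F(d2, sample(grp)))
p_value <- (1 + sum(F_perm >= F_obs)) / (1 + length(F_perm))
print(c(F = F_obs, p = p_value))
```

A small p-value here only indicates that at least one species differs; pairwise follow-up tests would still be needed to say which ones.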