When I run pairwise.wilcox.test() on data with many ties, I get the following warning:

Warning in wilcox.test.default(xi, xj, paired = paired, ...) :
  cannot compute exact p-value with ties

How does wilcox.test() handle ties?

What method is used (by default) to rank the observations?

What does "P value adjustment method: holm" mean?

SteveMcManaman

1 Answer

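When there are ties, wilcox.test uses a Normal approximation with a tie-corrected variance instead of the exact distribution. You can see this in the source of stats:::wilcox.test.default (the function named in the warning); here is a slightly simplified version.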

## example values
x <- 1:5
y <- 2:6
## assumes mu=0
r <- c(x, y)
## slightly simplified (assumes `digits.rank` is equal to its default `Inf` value)
r <- rank(r)
NTIES <- table(r)
n.x <- length(x)
n.y <- length(y)
STATISTIC <- c("W" = sum(r[seq_along(x)]) - n.x * (n.x + 1) / 2)
z <- STATISTIC - n.x * n.y / 2
SIGMA <- sqrt((n.x * n.y / 12) *
            ((n.x + n.y + 1)
             - sum(NTIES^3 - NTIES) ## this will be zero in the absence of ties
                           / ((n.x + n.y) * (n.x + n.y - 1))))
## stuff about continuity correction omitted here
z <- z/SIGMA ## z-score, used to compute p-value
2 * pnorm(-abs(z))  ## 2-tailed p-value (valid whether z is in the lower or upper tail)

This gives the same p-value as wilcox.test(x, y, correct = FALSE).
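On the ranking question: the snippet above calls rank() without a ties.method argument, so the default "average" applies and tied observations each receive the mean of the ranks they span. The sketch below illustrates that with a made-up vector `v`, then repeats the tie-corrected calculation and compares it with wilcox.test(). Passing `exact = FALSE` is my addition, to request the Normal approximation explicitly rather than rely on the fallback that triggers the warning.

```r
## average ranks: tied values share the mean of the positions they occupy
v <- c(10, 20, 20, 30)   # made-up values for illustration
r_avg <- rank(v)         # default ties.method = "average"
r_avg                    # 1.0 2.5 2.5 4.0

## tie-corrected Normal approximation, checked against wilcox.test()
x <- 1:5
y <- 2:6
r <- rank(c(x, y))
n.x <- length(x)
n.y <- length(y)
W <- sum(r[seq_along(x)]) - n.x * (n.x + 1) / 2
n_ties <- table(r)       # size of each group of tied ranks
sigma <- sqrt((n.x * n.y / 12) *
              ((n.x + n.y + 1) -
               sum(n_ties^3 - n_ties) / ((n.x + n.y) * (n.x + n.y - 1))))
z <- (W - n.x * n.y / 2) / sigma
p_manual <- 2 * pnorm(-abs(z))

## exact = FALSE asks for the Normal approximation explicitly
p_builtin <- wilcox.test(x, y, correct = FALSE, exact = FALSE)$p.value
all.equal(p_manual, p_builtin)
```

If the two p-values agree, the only effect of the ties here is the `sum(n_ties^3 - n_ties)` term, which shrinks the variance.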

As for the p-value adjustment ("holm"): this is the default p.adjust.method in pairwise.wilcox.test(), and it is passed on to p.adjust(). The help page ?p.adjust says that it uses the method from Holm (1979), which corrects for multiple comparisons by controlling the family-wise error rate.

Holm, S. (1979). A simple sequentially rejective multiple test procedure. Scandinavian Journal of Statistics, 6, 65-70. https://www.jstor.org/stable/4615733.
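To make the Holm adjustment concrete, here is a minimal sketch of the step-down calculation done by hand and compared with p.adjust(). The raw p-values in `p` are made up for illustration, and the name `holm_manual` is mine; this is meant to mirror what the help page describes, not to reproduce the internal code.

```r
## hypothetical raw p-values from m = 4 pairwise comparisons
p <- c(0.01, 0.04, 0.03, 0.005)
m <- length(p)
o <- order(p)                          # indices that sort p in ascending order

## the k-th smallest p-value is multiplied by (m - k + 1) ...
scaled <- (m - seq_len(m) + 1) * p[o]
## ... then the sequence is made non-decreasing and capped at 1
adj_sorted <- pmin(1, cummax(scaled))

holm_manual <- numeric(m)
holm_manual[o] <- adj_sorted           # restore the original order
holm_manual
p.adjust(p, method = "holm")           # should agree with holm_manual
```

The smallest p-value gets the full Bonferroni factor m, and each later one gets a smaller factor, which is why Holm is never more conservative than Bonferroni.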

Ben Bolker