
I came across several suggestions on the web that most common statistical tests can be expressed as general(ized) linear models (cf. here). The author suggests first transforming the data into signed ranks (as the Wilcoxon test does) and then fitting a linear model:

signed_rank = function(x) sign(x) * rank(abs(x))

# one-sample test
summary(lm(signed_rank(y) ~ 1))

# paired two-sample test
summary(lm(signed_rank(z - y) ~ 1))
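
If I try this on some toy data, the p-values from the two approaches are similar but not identical (which, as I understand it, is the point of the approximation):

set.seed(1)
y <- rnorm(20, mean = 0.5)

wilcox.test(y)                    # built-in one-sample Wilcoxon signed-rank test
summary(lm(signed_rank(y) ~ 1))   # linear-model approximation on signed ranks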

However, due to this rank transformation, the residuals of the linear model are of course no longer normal, so this general assumption of linear models is not fulfilled. I therefore wonder whether you can suggest a GLM (generalized linear model) alternative. What would the code look like in R?
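
To make this more concrete, the following is roughly what I have in mind (only a sketch: I picked the betareg package and an arbitrary rescaling of the signed ranks into (0, 1), and I have no idea whether this is a valid procedure):

library(betareg)

# rescale the signed ranks from roughly [-n, n] into the open interval (0, 1);
# this particular rescaling is an arbitrary illustrative choice
rescale01 <- function(sr) {
  n <- length(sr)
  (sr + n + 0.5) / (2 * n + 1)
}

# one-sample analogue: under symmetry about 0 the rescaled signed ranks should be
# roughly centred at 0.5, i.e. an intercept of 0 on the logit link scale
summary(betareg(rescale01(signed_rank(y)) ~ 1))

Is something along these lines defensible from a hypothesis-testing point of view, and if so, what would the proper R formulation be?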

Thanks a lot for your help in advance!

Anti
  • It's not clear what your goal is here. I don't think anyone was claiming that the signed ranks would satisfy the assumptions of linear models, just that a linear model on signed ranks was **approximately equivalent** to a Wilcoxon test. – Ben Bolker Apr 23 '22 at 16:08
  • @BenBolker Yeah, maybe. But the p value "estimation" can be quite biased then. To make the approximation of multiple tests by linear models, one has to find a way to estimate p values correctly, right? Otherwise it doesn't make sense to use such a test at all (in terms of hypothesis testing). Thus, I wonder whether a GLM (e.g. assuming a beta link function) could be applied here. Of course, then the original data would need to be adjusted to a range of [0, 1]. (?!) Could that be a valid procedure (from a hypothesis-testing point of view), and what would be the R formula to apply such a GLM? – Anti Apr 23 '22 at 17:49
  • 1
    Seems like a stretch/might be a better question for [CrossValidated](https://stats.stackexchange.com) ... (wouldn't solve the "what's the R code for this?" question but would address the more fundamental "is there a sensible way to formulate this?" question ...) – Ben Bolker Apr 24 '22 at 19:54

0 Answers