I want to calculate the effect size for a number of pairwise comparisons in R. The data is paired: the df consists of 115 respondents who each rated 7 entities on a scale of 1-4. I transformed the df to the melted (long) format (here an imaginary example):
respondent entity rating
1 1 2
1 2 4
1 3 1
1 4 3
1 5 4
1 6 3
1 7 4
2 1 3
2 2 3
2 3 2
2 4 4
2 5 1
2 6 3
2 7 2
.......
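For reference, the long layout above can be produced from a wide table roughly as follows. This is a minimal base-R sketch on a two-respondent toy data frame; the object names wide and long are only illustrative, and the ratings are the imaginary ones from the example. The rows are sorted on purpose: as far as I understand, pairwise.wilcox.test(..., paired = TRUE) pairs observations by their position within each group, so each entity's ratings need to be in the same respondent order.

```r
# Toy wide data: one row per respondent, one column per entity (imaginary values)
wide <- data.frame(
  respondent  = 1:2,
  rating_ent1 = c(2, 3),
  rating_ent2 = c(4, 3),
  rating_ent3 = c(1, 2)
)

n_ent <- ncol(wide) - 1  # number of rating columns

# Stack the rating columns: unlist() goes column by column,
# so entity is repeated "each" and respondent is repeated "times"
long <- data.frame(
  respondent = rep(wide$respondent, times = n_ent),
  entity     = rep(seq_len(n_ent), each = nrow(wide)),
  rating     = unlist(wide[, -1], use.names = FALSE)
)

# Sort so that every entity lists the respondents in the same order
long <- long[order(long$respondent, long$entity), ]
rownames(long) <- NULL
long$entity <- factor(long$entity)

long
```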
The goal is to compare the rating of each of the 7 entities to the other 6. I've already conducted pairwise comparisons using pairwise.wilcox.test(data$rating, data$entity, p.adjust.method = "holm", exact = FALSE, paired = TRUE), which worked fine. I then wanted to calculate the effect size of each comparison using wilcox_effsize():
df = data %>% wilcox_effsize(rating ~ entity, paired = TRUE) %>% as.data.frame()
Since I wasn't quite sure whether I was using this function correctly, I compared the results to the effect sizes which I calculated 'manually' using the function provided by Field et al. (2012, p. 665):
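rFromWilcox <- function(wilcoxModel, N) {
  z <- qnorm(wilcoxModel$p.value / 2)  # z recovered from the two-sided p-value (negative for p < 1)
  r <- z / sqrt(N)                     # effect size r = z / sqrt(N)
  cat(wilcoxModel$data.name, "Effect Size, r = ", r)
}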
I entered 230 as N, since 115*2 = 230. I computed a separate "wilcoxModel" for each comparison using the data in wide format and the following code, and then passed each model to the function:
model_entity1vs2 = wilcox.test(data$rating_ent1, data$rating_ent2, paired = TRUE, exact = FALSE)
Data in wide format:
respondent rating_ent1 rating_ent2 rating_ent3 .......
1 1 2 3
2 2 4 4
3 3 1 2
4 4 3 1
5 4 4 2
6 2 3 1
.... .... ... ...
115 1 3 2
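Instead of writing out each model by hand, all 21 comparisons could be generated in one go. The sketch below is only an illustration on simulated ratings (not my real data); the object names pairs and models are made up for this example, and the column names follow the wide layout above.

```r
set.seed(1)  # simulated ratings only, NOT the real survey data

# 115 respondents x 7 entities, ratings drawn at random from 1-4
wide <- as.data.frame(matrix(sample(1:4, 115 * 7, replace = TRUE), nrow = 115))
names(wide) <- paste0("rating_ent", 1:7)

# All unordered pairs of entity columns: choose(7, 2) = 21
pairs <- combn(names(wide), 2, simplify = FALSE)

# One paired Wilcoxon signed-rank test per pair, same settings as in the post
models <- lapply(pairs, function(p) {
  wilcox.test(wide[[p[1]]], wide[[p[2]]], paired = TRUE, exact = FALSE)
})
names(models) <- vapply(pairs, paste, character(1), collapse = "_vs_")

length(models)
```

Each element of models can then be passed to the effect size function in the same way as a single hand-written model.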
And then for each comparison:
rFromWilcox(model_entity1vs2, 230)
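To get a feeling for how much the choice of N matters in this formula, here is a small sketch. The helper r_from_p is hypothetical (it is the same formula as rFromWilcox(), but returns the absolute value instead of printing), and the p-value is purely illustrative. Since N only enters through sqrt(N), using the number of pairs (115) instead of the number of observations (230) scales r by sqrt(2), whatever the p-value is. Which of the two N values is appropriate for paired data is exactly what I am unsure about.

```r
# Hypothetical helper: same formula as rFromWilcox(), but returns |r|
r_from_p <- function(p_value, N) {
  z <- qnorm(p_value / 2)  # negative for p < 1
  abs(z) / sqrt(N)
}

p <- 0.001  # illustrative p-value, not taken from my data

r_obs   <- r_from_p(p, N = 230)  # N = number of observations (115 * 2)
r_pairs <- r_from_p(p, N = 115)  # N = number of pairs

r_pairs / r_obs  # ratio equals sqrt(2), roughly 1.41
```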
I then noticed that the effect sizes I calculated manually were substantially smaller than the ones in the output of wilcox_effsize() (e.g. r = 0.35 vs. r = 0.52), so something must be wrong. I first thought that the function might have a problem with the long format and might have misread the sample size, but the output gives n1 = 115 and n2 = 115, so that seems to be in order:
y group 1 group 2 effsize n1 n2 magnitude
rating entity1 entity2 0.52345 115 115 large
rating entity1 entity3 0.51847 115 115 large
rating entity1 entity4 0.38759 115 115 moderate
...
Something is clearly wrong, but I'm not sure what. Did I make a mistake in my code when using wilcox_effsize() for paired data?