Marascuilo procedure in R

Question

I'm conducting the marascuilio procedure in order to compare differences between proportions. I'm using the following code (copied and adapted from this tutorial:

## Set the proportions of interest.
p = c(0.3481, 0.1730, 0.4788)
N = length(p)
value = critical.range = c()

## Compute critical values.
for (i in 1:(N-1))
{ for (j in (i+1):N)
{
  value = c(value,(abs(p[i]-p[j])))
  critical.range = c(critical.range,
                     sqrt(qchisq(.95,3))*sqrt(p[i]*(1-p[i])/12000 + p[j]*(1-p[j])/12000))
}
}
round(cbind(value,critical.range),3)

I need that the output will print also the labels of the categories (e.g. which categories are exactly being compared).

So if the categories are listed in a seperated vector, e.g. categories <- c("cat1", "cat2", cat"3), the comparisons are cat1-cat2 , cat1-cat3, and cat2-cat3.

How can i append these labels to my output?

    value critical.range
[1,] 0.175          0.016
[2,] 0.131          0.018
[3,] 0.306          0.016

What should the output dataframe look like in the example above? — thepule, Aug 23 '16 at 11:24
I added an example, so basically another column should be added and carry the information about categories comparisons — Dmitry Leykin, Aug 23 '16 at 11:30

thepule · Answer 1 · 2016-08-23T11:44:52.877

Try this:

## Set the proportions of interest.
p = c(0.3481, 0.1730, 0.4788)
N = length(p)
value = critical.range = tag = c()
categories <- c("cat1", "cat2", "cat3")

## Compute critical values.
for (i in 1:(N-1)){ 
    for (j in (i+1):N){

    value <- c(value,(abs(p[i]-p[j])))
    critical.range = c(critical.range,
                       sqrt(qchisq(.95,N-1))*sqrt(p[i]*(1-p[i])/12000 + p[j]*(1-p[j])/12000))
    tag = c(tag, paste(categories[i], categories[j], sep = "-"))

    }
}
df <- as.data.frame(cbind(value,critical.range, tag), stringsAsFactors = F)
df$value <- round(as.numeric(df$value),3)
df$critical.range <- round(as.numeric(df$critical.range),3)

Output:

 value critical.range       tag
1 0.175          0.016 cat1-cat2
2 0.131          0.018 cat1-cat3
3 0.306          0.016 cat2-cat3

Great. I'm adding `df$significance <- ifelse(df$value>df$critical.range,"yes","no") ` to the `df` so it also will indicate whether the comparison is significant or not. — Dmitry Leykin, Aug 23 '16 at 12:01

score 1 · Accepted Answer · edited Apr 05 '18 at 19:39

1

Be careful about the denominator in your calculation of the critical.range (12000)...that is based on your sample size for EACH category - if you don't have 12000 observations for each category, then that needs to be adjusted -if you have far fewer than 12000 observations, your critical values are probably much lower than what that function gave you (and consequently, you should have fewer sign. differences).

edited Apr 05 '18 at 19:39

NightOwl888

55,572
24
139
212

answered Apr 05 '18 at 18:50

user95146

128
9

The question asked was *How can i append these labels to my output?* This answer does not address the question. It should be a comment instead (which you can do once you get 50 reputation). – NightOwl888 Apr 05 '18 at 19:41

Marascuilo procedure in R

2 Answers2