2

I'm conducting the marascuilio procedure in order to compare differences between proportions. I'm using the following code (copied and adapted from this tutorial:

## Set the proportions of interest.
p = c(0.3481, 0.1730, 0.4788)
N = length(p)
value = critical.range = c()

## Compute critical values.
for (i in 1:(N-1))
{ for (j in (i+1):N)
{
  value = c(value,(abs(p[i]-p[j])))
  critical.range = c(critical.range,
                     sqrt(qchisq(.95,3))*sqrt(p[i]*(1-p[i])/12000 + p[j]*(1-p[j])/12000))
}
}
round(cbind(value,critical.range),3)

I need that the output will print also the labels of the categories (e.g. which categories are exactly being compared).

So if the categories are listed in a seperated vector, e.g. categories <- c("cat1", "cat2", cat"3), the comparisons are cat1-cat2 , cat1-cat3, and cat2-cat3.

How can i append these labels to my output?

    value critical.range
[1,] 0.175          0.016
[2,] 0.131          0.018
[3,] 0.306          0.016
Dmitry Leykin
  • 485
  • 1
  • 7
  • 14

2 Answers2

2

Try this:

## Set the proportions of interest.
p = c(0.3481, 0.1730, 0.4788)
N = length(p)
value = critical.range = tag = c()
categories <- c("cat1", "cat2", "cat3")

## Compute critical values.
for (i in 1:(N-1)){ 
    for (j in (i+1):N){

    value <- c(value,(abs(p[i]-p[j])))
    critical.range = c(critical.range,
                       sqrt(qchisq(.95,N-1))*sqrt(p[i]*(1-p[i])/12000 + p[j]*(1-p[j])/12000))
    tag = c(tag, paste(categories[i], categories[j], sep = "-"))

    }
}
df <- as.data.frame(cbind(value,critical.range, tag), stringsAsFactors = F)
df$value <- round(as.numeric(df$value),3)
df$critical.range <- round(as.numeric(df$critical.range),3)

Output:

 value critical.range       tag
1 0.175          0.016 cat1-cat2
2 0.131          0.018 cat1-cat3
3 0.306          0.016 cat2-cat3
thepule
  • 1,721
  • 1
  • 12
  • 22
  • 1
    Great. I'm adding `df$significance <- ifelse(df$value>df$critical.range,"yes","no") ` to the `df` so it also will indicate whether the comparison is significant or not. – Dmitry Leykin Aug 23 '16 at 12:01
1

Be careful about the denominator in your calculation of the critical.range (12000)...that is based on your sample size for EACH category - if you don't have 12000 observations for each category, then that needs to be adjusted -if you have far fewer than 12000 observations, your critical values are probably much lower than what that function gave you (and consequently, you should have fewer sign. differences).

NightOwl888
  • 55,572
  • 24
  • 139
  • 212
user95146
  • 128
  • 9
  • The question asked was *How can i append these labels to my output?* This answer does not address the question. It should be a comment instead (which you can do once you get 50 reputation). – NightOwl888 Apr 05 '18 at 19:41