I ran a chi-square test in R on a dataset with two nominal variables, subject category (SC) and research institution (RI). The table looks like this:

       RI1  RI2   RI3  RI4  RI5  RI6   RI7   RI8  RI9 RI10
sc1   4.95 2.97  2.97 5.94 3.96 7.92 25.74 44.55 0.99 0.00
sc2   6.53 3.01 11.55 5.52 5.02 6.03 23.61 38.19 0.00 0.50
sc3   6.12 4.08 10.20 6.12 0.00 2.04 24.48 44.89 0.00 2.04
sc4  10.00 0.00  2.00 8.00 0.00 4.00 32.00 42.00 0.00 2.00
sc5  10.93 3.12  6.25 3.12 1.56 6.25 23.43 42.18 1.56 1.56
sc6   6.10 4.58 12.21 6.87 3.05 4.58 24.42 35.87 1.52 0.76
sc7  11.90 7.14 11.90 7.14 2.38 2.38 33.33 19.04 0.00 4.76
sc8   8.60 3.22  6.98 5.37 3.76 3.76 20.96 43.01 1.61 2.68
sc9   7.27 4.84 13.93 6.06 4.24 2.42 19.39 40.00 1.21 0.60
sc10  3.75 0.00  8.75 7.50 1.25 1.25 33.75 40.00 2.50 1.25

The chi-square results are as follows:

    chisq.test(mydata)

        Pearson's Chi-squared test

    data:  mydata
    X-squared = 102.51, df = 81, p-value = 0.05357
    Warning message:
    In chisq.test(mydata) : Chi-squared approximation may be incorrect

I would like to apply a Bonferroni correction to the p-value. My hypothesis is that subject category does not influence the number of publications in a research institution. My question is: since I have 10 subject categories, should I divide the p-value by 10?

P.S. I have not yet reached 15 points therefore cannot create a new tag "Bonferroni correction"

JayPeerachai
tom sawyer
  • I'm confused why you believe you need to correct for multiple testing. – Roland Mar 24 '17 at 10:00
  • To get a p-value cutoff using Bonferroni, you divide the p-value threshold by the number of chi-square tests, then use the result as the new threshold. For example, with a p-value threshold of 0.05 and 10 tests, the new threshold is 0.05/10 = 0.005. Any test with a p-value less than or equal to 0.005 is then considered significant. – Sathish Mar 24 '17 at 10:02
  • @Roland I will repeat the chi-square test for each row, hence the need for a correction. – tom sawyer Mar 24 '17 at 10:06
  • @Sathish Are you suggesting that I should do a chi-square test for each row? – tom sawyer Mar 24 '17 at 10:10

1 Answer

If you are going to do multiple pairwise comparisons after your overall chi-square test, the Bonferroni-corrected significance threshold would be .05/(number of tests). See helpful references here and here.
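As a minimal sketch of that arithmetic, assuming 10 follow-up tests: the p-values below are made-up illustrative numbers, not results from the question's data. Comparing raw p-values against .05/(number of tests) and comparing Bonferroni-adjusted p-values from base R's `p.adjust()` against .05 should lead to the same decisions.

```r
# Hypothetical raw p-values from 10 follow-up tests (illustrative only,
# NOT computed from the table in the question)
p_raw <- c(0.001, 0.004, 0.012, 0.030, 0.049,
           0.080, 0.200, 0.350, 0.600, 0.900)

alpha <- 0.05
m     <- length(p_raw)   # number of tests

# Option 1: keep the p-values, shrink the threshold to alpha / m
threshold        <- alpha / m
sig_by_threshold <- p_raw <= threshold

# Option 2: keep alpha, inflate the p-values (each is multiplied by m, capped at 1)
p_adj         <- p.adjust(p_raw, method = "bonferroni")
sig_by_adjust <- p_adj <= alpha

# Side-by-side view of both approaches
print(data.frame(p_raw, p_adj, sig_by_threshold, sig_by_adjust))
```

Note that in neither option is the p-value itself divided by the number of tests: either the threshold shrinks or the p-values grow.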

You probably need to test all possible pairs, meaning that you'd be doing a lot more than 10 tests. However, before going ahead, you might want to think about other ways to tackle this.
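To give a sense of scale, here is a rough sketch of what testing every pair of subject categories could look like. It assumes the pairwise comparisons are between rows, and it uses a simulated matrix of raw counts (`counts`, a hypothetical stand-in) because the table in the question does not show raw publication counts.

```r
# With 10 subject categories there are choose(10, 2) = 45 possible pairs
n_pairs   <- choose(10, 2)
threshold <- 0.05 / n_pairs   # Bonferroni-corrected threshold for 45 tests

# Simulated raw counts (hypothetical data, for illustration only)
set.seed(1)
counts <- matrix(rpois(10 * 10, lambda = 20), nrow = 10,
                 dimnames = list(paste0("sc", 1:10), paste0("RI", 1:10)))

# All pairs of rows: a 2 x 45 character matrix of row names
pairs <- combn(rownames(counts), 2)

# One chi-square test per pair of rows (each is a 2 x 10 sub-table)
p_raw <- apply(pairs, 2, function(pr) chisq.test(counts[pr, ])$p.value)
names(p_raw) <- paste(pairs[1, ], pairs[2, ], sep = " vs ")

# Bonferroni adjustment across all 45 tests
p_adj <- p.adjust(p_raw, method = "bonferroni")

# Pairs that remain significant at 0.05 after correction (may be none)
print(p_adj[p_adj <= 0.05])
```

With 45 tests the corrected threshold is roughly 0.0011, which is one reason to reconsider the analysis before running every pairwise comparison.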

The first step might be to reassess your hypothesis. For example, if your research institutions have different numbers of researchers, your analysis would need to take that into account (more researchers can be expected to produce more publications, regardless of subject). Once you have clarified your research question, you might want to use a statistical method other than chi-square. It may help to search the questions and answers at Cross Validated.

jules
  • I have considered the number of authors in the descriptive statistics of this study, where I divided the total number of publications by the number of authors to see which institution has a higher number of articles per author. Any suggestions on statistical tests I can use to test the correlation between the number of authors and publication counts? – tom sawyer Mar 25 '17 at 17:50
  • Recommending statistical tests is really outside the scope of SO. It would help to search the answers on Cross Validated (linked in my answer) and then, if necessary, write a clear question there. Good luck. – jules Mar 26 '17 at 22:08