
I have a variable X and 16 groups of samples. I would like to know which group is most strongly associated with this variable (specifically, the one with the lowest values). I performed an ANOVA and a TukeyHSD post-hoc test, but that only highlights which groups differ from one another on variable X.

Is there a way to determine which group is significantly associated with the lowest values of variable X?

Thanks for your help

1 Answer


With the post-hoc comparisons already in place, you know which groups differ from one another, so all you still need is the mean of X within each group. The group means are easily calculated in standard statistical software, and the post-hoc results tell you which of those means differ significantly from one another.
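As a minimal sketch of this first approach in R, the group means can be sorted and read alongside the TukeyHSD table. The data below are simulated purely for illustration (16 groups, as in the question, with groups 1 to 3 shifted downward), and the names `dat`, `group_means`, `tuk` and `sig` are placeholders rather than anything from the original analysis.

```r
# Hypothetical data: 16 groups of 10 samples; groups 1-3 are given lower values of X.
set.seed(1)
dat <- data.frame(
  group = factor(rep(1:16, each = 10)),
  X     = rnorm(160, mean = rep(c(rep(0, 3), rep(2, 13)), each = 10))
)

# One-way ANOVA followed by Tukey's HSD, as in the question.
fit     <- aov(X ~ group, data = dat)
posthoc <- TukeyHSD(fit)

# Mean of X within each group, sorted so the lowest means come first.
group_means <- sort(tapply(dat$X, dat$group, mean))

# Pairwise comparisons whose adjusted p-value is below 0.1.
tuk <- posthoc[[1]]   # matrix with columns: diff, lwr, upr, p adj
sig <- tuk[tuk[, "p adj"] < 0.1, , drop = FALSE]

print(head(group_means, 3))   # the three groups with the lowest means
print(rownames(sig))          # which pairs differ
```

Reading the two outputs together shows which of the low-mean groups are separated from which other groups. The 0.1 cutoff is simply the threshold mentioned in the comments, not a recommendation.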

Alternatively, you can dummy-code the group variable (i.e., replace the 6-level factor with 5 indicator variables, leaving one group as the reference). A regression of X on the dummy variables is equivalent to the ANOVA model (in most respects) and allows for most pairwise comparisons (depending on the coding).

The regression coefficients indicate the differences between groups, and the tests of the coefficients indicate whether these differences are significant at a given significance level.
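The dummy-coded regression can be sketched in R as follows. R's `lm` applies treatment (dummy) coding to a factor by default, so each coefficient is the difference between one group and the reference group. The data are again simulated for illustration with 16 groups, and taking the group with the lowest observed mean as the reference is an assumption made here for readability, not part of the answer above.

```r
# Hypothetical data: 16 groups of 10 samples; groups 1-3 are given lower values of X.
set.seed(1)
dat <- data.frame(
  group = factor(rep(1:16, each = 10)),
  X     = rnorm(160, mean = rep(c(rep(0, 3), rep(2, 13)), each = 10))
)

# Make the group with the lowest observed mean the reference level (illustrative choice).
lowest    <- names(which.min(tapply(dat$X, dat$group, mean)))
dat$group <- relevel(dat$group, ref = lowest)

# Regress X on the dummy-coded factor: 15 indicator variables plus an intercept.
fit   <- lm(X ~ group, data = dat)
coefs <- summary(fit)$coefficients   # columns: Estimate, Std. Error, t value, Pr(>|t|)

# Each non-intercept row is "group k minus reference group".
print(coefs)
```

The p-values in this table are not adjusted for multiple comparisons. Something like `p.adjust(coefs[-1, 4], method = "fdr")` could be applied if an FDR correction is wanted, and picking the reference group after looking at the data calls for some caution when interpreting the tests.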

SimonG
  • Thanks for your quick answer! BTW, it is 16 groups and not 6; I forgot the 1, but it doesn't change the problem. So what you suggest first is to look at the group with the lowest mean for X. The problem here is that I don't have one group that is significantly different from all the others. So if I choose the group or groups with the smallest mean of X and the highest number of p-values < 0.1 (with FDR correction), is that right? – user1997740 Aug 15 '14 at 08:02
  • The global hypothesis tested in the ANOVA is whether any group is different from the others. If the global test is not significant, then probably no groups will differ significantly in pairwise comparisons. – SimonG Aug 15 '14 at 08:08
  • The global test works on the sum of squares and is not equivalent to the "sum" of group comparisons (but to a multivariate test). This means that it need not be "the one group" that makes F high or low. In some cases, the cause of a significant or non-significant F is not easily detectable except by means of post-hoc/group comparisons. – SimonG Aug 15 '14 at 08:20
  • Of course it is! What I mean is this: which(posthoc[[1]][,4] < 0.1) returns 5-1 6-1 7-1 5-2 6-2 7-2 9-5 8-6 9-6 9-7. These are the comparisons with a p-value < 0.1 in the TukeyHSD. Groups 5, 6 and 7 have the highest means and groups 1, 2 and 9 the lowest. What can I say from these results? May I say that 1, 2 and 9 are the groups significantly associated with the lowest values of variable X? – user1997740 Aug 15 '14 at 08:21
  • You can state the results just as they are: (a) the global null hypothesis can be rejected, (b) pairwise comparisons indicate differences between {567} and {129}, and (c) the groups {129} have the lowest means with X={???}. I wouldn't speak of "significantly associated", however, but as far as I understand, your general statement is supported by your data. – SimonG Aug 15 '14 at 08:24