
My stats knowledge is rusty, so please correct me if I use the wrong terminology or misunderstand anything.

I am using adonis2() from the vegan package to run a PERMANOVA with the following call:

nmds.div<- adonis2(nmds.dist ~ Season*Area, data = Type0, permutations = 999, method="bray")

Here Season has three levels (March, May, Sept) and Area has two levels (Pacific, Atlantic). The dependent variable is a Bray-Curtis distance matrix computed from OTU read counts. I want to test the interaction between Season and Area, but this is what I get:

         Df SumOfSqs      R2      F Pr(>F)    
Season    2   6.4903 0.27066 8.9066  0.001 ***
Residual 48  17.4889 0.72934                  
Total    50  23.9792 1.00000  

When I run the same call with Cruise and Layer3, the output table looks fine and includes the interaction term and its p-value for Cruise:Layer3. Here Cruise has three levels (KS17, KS14 and HO15) and Layer3 has two levels (euphotic, aphotic).

nmds.div<- adonis2(nmds.dist ~ Cruise*Layer3, data = Type0, permutations = 999, method="bray")
              Df   SumOfSqs         R2        F Pr(>F)
Cruise         2  6.4903090 0.27066356 9.787264  0.001
Layer3         1  0.4029121 0.01680253 1.215168  0.311
Cruise:Layer3  2  2.1654176 0.09030381 3.265409  0.002
Residual      45 14.9206109 0.62223010       NA     NA
Total         50 23.9792496 1.00000000       NA     NA

Table produced by:

table(Type0$Season, Type0$Area)
        Pacific Atlantic
  Mar        16        0
  May        27        0
  Sept        0        8

So my question is: why does the same code work for Cruise*Layer3 but not for Season*Area? Are there restrictions on the independent variables?

locopoko
  • What is the content of your Area variable? Perhaps including some sample data in your question would make answering this easier. – xilliam Oct 20 '22 at 08:24
  • Hi, thank you! I just added some descriptions for the independent and dependent variables. – locopoko Oct 20 '22 at 09:04
  • To get a sense of the sample size of your data, could you describe the table produced by `table(Type0$Season, Type0$Area)`. – xilliam Oct 20 '22 at 10:01

1 Answer


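I think the short answer is that the terms in your model are perfectly collinear (aliased): Area is completely determined by Season, because all of your "Sept" samples came from the "Atlantic" and all of your "Mar" and "May" samples came from the "Pacific".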

In other words, once Season is in the model, "Area" provides no additional information, so adonis2() drops it, along with the Season:Area interaction.
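The redundancy can be checked without any community data by looking only at the design. The following is a minimal sketch in base R, assuming the cell counts from the question (16 Mar, 27 May, 8 Sept); the variable names `X_season`, `X_full`, `rank_season` and `rank_full` are made up for illustration. It compares the number of columns the `Season * Area` model asks for with the rank of the design matrix, that is, the number of columns that can actually be estimated.

```r
# Rebuild only the design from the question's cell counts.
season <- factor(rep(c("Mar", "May", "Sept"), times = c(16, 27, 8)),
                 levels = c("Mar", "May", "Sept"))
# Area is fully determined by Season: only Sept samples are Atlantic.
area <- factor(ifelse(season == "Sept", "Atlantic", "Pacific"),
               levels = c("Pacific", "Atlantic"))

X_season <- model.matrix(~ season)         # Season only
X_full   <- model.matrix(~ season * area)  # Season + Area + Season:Area

rank_season <- qr(X_season)$rank
rank_full   <- qr(X_full)$rank

ncol(X_full)   # columns requested by the full model
rank_full      # columns that are linearly independent
rank_season    # rank of the Season-only design
```

If `rank_full` equals `rank_season`, the Area and interaction columns add nothing beyond Season, which is consistent with only Season appearing in the output table.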

To see what I mean, here are two examples of simulated data. The first has the cell counts that match your data. Here you end up with a single factor in the result. 'Area' was dropped.

library(vegan)  # provides adonis2()

# fake data 1
nmds <- sample(1:1000, 51, replace = TRUE)
season <- factor(c(rep(1, 16), rep(2, 27), rep(3, 8)), 
                 labels= c("Mar", "May", "Sept"))
area <- factor(c(rep(1,43), rep(2,8)), labels = c("Pacific", "Atlantic"))
Type0 <- data.frame(nmds = nmds, Season =season, Area=area)

# cell counts
> table(Type0$Season, Type0$Area)
      
       Pacific Atlantic
  Mar       16        0
  May       27        0
  Sept       0        8

nmds.div1 <- adonis2(nmds ~ Season*Area, data = Type0, 
                   permutations = 999, method="bray")
> nmds.div1

adonis2(formula = nmds ~ Season * Area, data = Type0, permutations = 999, method = "bray")
         Df SumOfSqs      R2      F Pr(>F)
Season    2   0.1720 0.02919 0.7216  0.583
Residual 48   5.7204 0.97081              
Total    50   5.8924 1.00000             

In this second example, I assign Area at random, which gives nonzero counts in every cell of the table. The factors are then no longer redundant, and adonis2() returns estimates for both factors and the interaction.

# fake data 2
nmds <- sample(1:1000, 51, replace = TRUE)
season <- factor(c(rep(1, 16), rep(2, 27), rep(3, 8)), 
                 labels= c("Mar", "May", "Sept"))
set.seed(1)
area <- factor(sample(1:2, 51, replace = TRUE), labels = c("Pacific", "Atlantic"))
Type0 <- data.frame(nmds = nmds, Season =season, Area=area)

# cell counts
> table(Type0$Season, Type0$Area)

       Pacific Atlantic
  Mar       11        5
  May       14       13
  Sept       2        6


nmds.div2 <- adonis2(nmds ~ Season*Area, data = Type0, 
                   permutations = 999, method="bray")

> nmds.div2
adonis2(formula = nmds ~ Season * Area, data = Type0, permutations = 999, method = "bray")
            Df SumOfSqs      R2      F Pr(>F)
Season       2   0.2721 0.04736 1.1661  0.313
Area         1   0.1721 0.02995 1.4747  0.233
Season:Area  2   0.0515 0.00895 0.2205  0.948
Residual    45   5.2510 0.91374              
Total       50   5.7467 1.00000  
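A quick check before fitting is to cross-tabulate the two factors and look for empty cells. Below is a minimal sketch in base R; `has_empty_cells()` is a hypothetical helper written for this example, not a vegan function, and the two designs are rebuilt from the cell counts shown in the two tables above.

```r
# Hypothetical helper (not part of vegan):
# TRUE if any combination of the two factors has no samples.
has_empty_cells <- function(f1, f2) any(table(f1, f2) == 0)

season <- factor(rep(c("Mar", "May", "Sept"), times = c(16, 27, 8)),
                 levels = c("Mar", "May", "Sept"))

# Design 1: Area fully determined by Season, as in the question.
area_confounded <- factor(ifelse(season == "Sept", "Atlantic", "Pacific"),
                          levels = c("Pacific", "Atlantic"))

# Design 2: Area crossed with Season, using the counts from the second table.
area_crossed <- factor(c(rep(c("Pacific", "Atlantic"), times = c(11, 5)),
                         rep(c("Pacific", "Atlantic"), times = c(14, 13)),
                         rep(c("Pacific", "Atlantic"), times = c(2, 6))),
                       levels = c("Pacific", "Atlantic"))

has_empty_cells(season, area_confounded)  # TRUE
has_empty_cells(season, area_crossed)     # FALSE
```

An empty cell is a warning sign rather than proof that a term will be dropped, so it is worth following up with a look at the full table.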
xilliam
  • Thank you so much for the detailed response! I think that is the case too... even when I try adonis2 without testing for the interaction term, by using the + operator instead of * it still drops certain factors. I guess I will have to select the factors I test carefully. – locopoko Oct 23 '22 at 05:27