Two Way Anova using Python

Question

I am trying to run a two-way ANOVA to assess the effect of two variables (B and M) on the classification of samples (given by the parameter C).

I am trying to reshape the data frame to make it suitable for the statsmodels package. However, using pd.melt I have only been able to include one variable at a time (either B or M).

Any suggestions on how to use the values of both variables in the two-way ANOVA (as in the last two, commented-out lines of the code below) would be a great help.

The values of B, M and C:

B : [10.,4.,4.,6.,5.]
M : [9.,6.,8.,4.,6.]
C : [1.,2.,2.,3.,1.]

import numpy as np
import pandas as pd
import statsmodels.api as sm
from statsmodels.formula.api import ols
d = pd.read_csv("/Users/Hrihaan/Desktop/Data.txt", sep=r"\s+")
d_melt = pd.melt(d, id_vars=['C'], value_vars=['B'])
#model = ols('C ~ C(B) + C(M) + C(B):C(M)', data=d_melt).fit()
#anova_table = sm.stats.anova_lm(model, typ=2)

why would you convert B and M to categorical? And what exactly is C? — StupidWolf, Sep 19 '20 at 21:43

score 0 · Answer 1 · answered Dec 14 '21 at 15:15

You were close to the answer:

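import pandas as pd
import statsmodels.api as sm
from statsmodels.formula.api import ols

B = [10., 4., 4., 6., 5.]
M = [9., 6., 8., 4., 6.]
C = [1., 2., 2., 3., 1.]

# One column per variable; no pd.melt is needed
d = pd.DataFrame({"B": B, "M": M, "C": C})

# B and M enter as numeric predictors; B:M is their interaction
model = ols("C ~ B + M + B:M", data=d).fit()

# Type II ANOVA table for the fitted model
anova_table = sm.stats.anova_lm(model, typ=2)
print(anova_table)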

Create a data frame with one column per variable, fit the model, then run the ANOVA on the fitted model.
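On the reshaping part of the question: a formula model wants one column per variable, so the frame returned by read_csv should already be usable and pd.melt only gets in the way. Here is a small sketch of the difference. The layout of Data.txt is not shown in the question, so the in-memory text below (whitespace-delimited, header row "B M C") is an assumption standing in for the real file.

```python
import io
import pandas as pd

# Stand-in for Data.txt; the real file layout is an assumption here
# (whitespace-delimited, header row "B M C").
raw = io.StringIO(
    "B M C\n"
    "10.0 9.0 1.0\n"
    "4.0 6.0 2.0\n"
    "4.0 8.0 2.0\n"
    "6.0 4.0 3.0\n"
    "5.0 6.0 1.0\n"
)
d = pd.read_csv(raw, sep=r"\s+")

# Wide format: B, M and C are all available as columns for a formula.
wide_columns = list(d.columns)

# Long format from the question: M is dropped, and B's values end up
# in a generic "value" column.
d_melt = pd.melt(d, id_vars=["C"], value_vars=["B"])
melt_columns = list(d_melt.columns)

print(wide_columns)
print(melt_columns)
```

After the melt there is no column called B or M any more, which is why a formula that mentions both cannot be fitted on d_melt.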

user4624500


this treats `B` and `M` as numeric and not as categorical – Josef Dec 14 '21 at 18:14
