I am trying to run a two-way ANOVA to assess the effect of two variables (B and M) on the classification of samples (given by the parameter C).
I am trying to reshape the data frame to make it suitable for the statsmodels package. However, with pd.melt I have only been able to include one variable at a time (either B or M).
Any suggestions on how I can use the values of both variables to perform the two-way ANOVA (along the lines of the last two, commented-out lines of the code below) would be a great help.
The values of B, M and C:
B : [10.,4.,4.,6.,5.]
M : [9.,6.,8.,4.,6.]
C : [1.,2.,2.,3.,1.]
import numpy as np
import pandas as pd
import statsmodels.api as sm
from statsmodels.formula.api import ols
d = pd.read_csv("/Users/Hrihaan/Desktop/Data.txt", sep=r"\s+")  # raw string so \s is not read as a string escape
d_melt = pd.melt(d, id_vars=['C'], value_vars=['B'])
#model = ols('C ~ C(B) + C(M) + C(B):C(M)', data=d_melt).fit()
#anova_table = sm.stats.anova_lm(model, typ=2)