I need to analyze my dataframe in Python. I have uploaded an image of what my table looks like. I need to fit "stage" (dependent variable) and "overallscore", "spatialreasoning" and "numericalmem" (independent variables) into a mixed-effects linear regression model.
This is the site I have used for guidance: https://www.pythonfordatascience.org/mixed-effects-regression-python/
I tried following their approach:
!pip install -q statsmodels
import numpy as np
import pandas as pd
import statsmodels.api as sm
import statsmodels.formula.api as smf
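# Load the data and drop rows with missing values
mf = pd.read_csv('mca.csv')
mf.dropna(inplace=True)
print(mf)
# This is the call that raises the ValueError quoted below
smf.mixedlm("C(stage) ~ overallscore + spatialreasoning + numericalmem",
            data=mf, groups="group").fit()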
But I'm getting this error: "ValueError: endog has evaluated to an array with multiple columns that has shape (60, 2). This occurs when the variable converted to endog is non-numeric (e.g., bool or str)."
I'm not sure how this could be the case, as all of the columns I have entered contain only integers. I would really appreciate help with how to fix this error, or a suggestion for a different approach I can try.
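A possible explanation for the (60, 2) shape, sketched on a toy table: wrapping the outcome in C() tells the formula parser to treat it as categorical, so it gets dummy-coded into one column per level even when the underlying values are integers. The snippet below assumes the formula is parsed by patsy (which statsmodels has used for its formula interface, as far as I know); the `toy` dataframe and its values are made up for illustration and are not the real data.

```python
import pandas as pd
import patsy

# Hypothetical toy data: "stage" holds integers with two distinct levels.
toy = pd.DataFrame({
    "stage": [1, 2, 1, 2, 1, 2],
    "overallscore": [10, 12, 9, 15, 11, 14],
})

# C(stage) on the left-hand side: the outcome is expanded to one column per level.
y_cat, _ = patsy.dmatrices("C(stage) ~ overallscore", toy)

# Plain stage on the left-hand side: the outcome stays a single numeric column.
y_num, _ = patsy.dmatrices("stage ~ overallscore", toy)

print(y_cat.shape)  # (6, 2)
print(y_num.shape)  # (6, 1)
```

If this is what happens here, the two columns in the error message would correspond to two distinct values of "stage" across the 60 rows.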
I tried converting the columns to numeric values anyway and re-running the model, but it didn't help: I still get the same error. This is the conversion I used:
mf['stage'] = pd.to_numeric(mf['stage'])
mf['overallscore'] = pd.to_numeric(mf['overallscore'])
mf['spatialreasoning'] = pd.to_numeric(mf['spatialreasoning'])
mf['numericalmem'] = pd.to_numeric(mf['numericalmem'])
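For reference, here is a minimal sketch of why the dtype conversion may make no difference: the C() wrapper sits in the formula string, so it applies regardless of the column's dtype. Since mca.csv is not available, the `demo` dataframe below is synthetic and hypothetical; only the column names follow the code above. Dropping C() from the outcome keeps "stage" as a single numeric column, which a linear mixed model expects. Whether a linear model is appropriate for a "stage" variable with only a few levels is a separate question.

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

# Synthetic stand-in for mca.csv (made-up values, 6 groups of 10 rows).
rng = np.random.default_rng(0)
n = 60
group = np.repeat(np.arange(6), 10)
stage = np.where(group >= 3, 2, 1)
flip = np.arange(n) % 4 == 0            # flip every 4th row so groups are not pure
stage = np.where(flip, 3 - stage, stage)
demo = pd.DataFrame({
    "group": group,
    "stage": pd.to_numeric(stage),      # already numeric, like the converted columns
    "overallscore": rng.integers(0, 100, n),
    "spatialreasoning": rng.integers(0, 50, n),
    "numericalmem": rng.integers(0, 50, n),
})

rhs = "overallscore + spatialreasoning + numericalmem"

# C(stage) as the outcome still fails, even though the column is numeric.
try:
    smf.mixedlm(f"C(stage) ~ {rhs}", data=demo, groups="group")
    raised = False
except ValueError as err:
    raised = True
    print("C(stage) version fails:", err)

# Plain stage as the outcome: a single endog column, so the model can be fit.
result = smf.mixedlm(f"stage ~ {rhs}", data=demo, groups="group").fit()
print(result.params)
```

On the real data the equivalent change would be to use "stage ~ overallscore + spatialreasoning + numericalmem" as the formula, assuming a numeric outcome is what is intended.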