
I have to analyze my dataframe. I have uploaded an image of what my table looks like. I need to fit "stage" (dependent variable), "overallscore" (independent variable), "spatialreasoning" (independent variable) and "numericalmemory" (independent variable) into a mixed effects model.

This is the site I have used for guidance: https://www.pythonfordatascience.org/mixed-effects-regression-python/

I tried following their approach:

!pip install -q statsmodels
import numpy as np
import pandas as pd
import statsmodels.api as sm
import statsmodels.formula.api as smf

mf=pd.read_csv('mca.csv')
mf.dropna(inplace=True)

print(mf)
smf.mixedlm("C(stage) ~ overallscore + spatialreasoning + numericalmem",
        data = mf,groups="group").fit()

But I'm getting this error, "ValueError: endog has evaluated to an array with multiple columns that has shape (60, 2). This occurs when the variable converted to endog is non-numeric (e.g., bool or str)."

I'm not sure how this could be the case, as all of the columns I have entered contain only integers. I would really appreciate help with how to rectify this error, or with whether there's a different approach I can try.

I tried converting the columns to numeric values anyway and then re-running it, but it didn't seem to work :/ I'm still getting the same error.

mf['stage'] = pd.to_numeric(mf['stage'])
mf['overallscore'] = pd.to_numeric(mf['overallscore'])
mf['spatialreasoning'] = pd.to_numeric(mf['spatialreasoning'])
mf['numericalmem'] = pd.to_numeric(mf['numericalmem'])
Anusha

1 Answer


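The dependent variable has to evaluate to a single numeric column, so use stage directly rather than C(stage). You can also specify the random effects structure explicitly with the re_formula parameter; re_formula="1" requests a random intercept for each group, which is also what mixedlm uses by default.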

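import pandas as pd
import statsmodels.formula.api as smf

# Load the same file as in the question and drop incomplete rows
mf = pd.read_csv('mca.csv').dropna()

# stage is used as a plain numeric response (no C()); re_formula="1" is a random intercept per group
model = smf.mixedlm("stage ~ overallscore + spatialreasoning + numericalmem",
                    data=mf, groups="group", re_formula="1")
result = model.fit()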
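
To see where the (60, 2) shape in the error comes from, here is a minimal sketch that builds the left-hand side of both formulas with patsy directly. It assumes that statsmodels constructs its formula design matrices with patsy, and it uses a made-up dataframe called demo in place of mca.csv: only the column names follow the question, the values are hypothetical.

```python
import numpy as np
import pandas as pd
from patsy import dmatrices

# Hypothetical stand-in for mca.csv: 60 rows, stage takes two integer levels
rng = np.random.default_rng(0)
n = 60
demo = pd.DataFrame({
    "stage": np.tile([1, 2], n // 2),
    "overallscore": rng.integers(0, 100, size=n),
    "spatialreasoning": rng.integers(0, 50, size=n),
    "numericalmem": rng.integers(0, 50, size=n),
})

rhs = "overallscore + spatialreasoning + numericalmem"

# C(stage) on the left-hand side is dummy-coded: one column per level
y_cat, _ = dmatrices("C(stage) ~ " + rhs, data=demo, return_type="dataframe")

# Plain stage stays a single numeric column
y_num, _ = dmatrices("stage ~ " + rhs, data=demo, return_type="dataframe")

print("C(stage):", y_cat.shape, list(y_cat.columns))
print("stage:   ", y_num.shape, list(y_num.columns))
```

So the integers in the column are not the problem; C() turns the response into several indicator columns, and mixedlm accepts only one. Note that without C() the model treats stage as a continuous outcome. If stage is really a two-level category, a binomial mixed model (statsmodels has BinomialBayesMixedGLM in statsmodels.genmod.bayes_mixed_glm) may be a better fit, though I have not tried it on this data.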
Abdulmajeed
  • However, it stops working when I specify that stage is a categorical variable. I get the same ValueError if I run: model = smf.mixedlm("C(stage) ~ overallscore + spatialreasoning + numericalmem", data=mf, groups="group", re_formula="1"). Is it going to make a difference if I fit it without specifying that stage is categorical? It does seem to work without it. – Anusha Mar 26 '23 at 15:42