How to write formulas for linear mixed effects models in Python (Statsmodels)?

Question

Bear with me, as I'm new to this level of statistics and to Python. I've read the statsmodels and patsy documentation but still have doubts.

I am trying to analyse longitudinal data using statsmodels MixedLM. Simplified a bit, I have 5 variables, with no collinearity between independent variables:

Outcome: the dependent variable.
Patient: the random effect, as each patient has multiple measurements of the outcome
Time: a fixed effect
Targeted: a fixed effect, 0 = no, 1 = yes, whether or not the patient was targeted for an intervention to address the outcome
Sex: a fixed effect, 0 = male, 1 = female

I want to know 2 things:

Is there an association between whether the patient was targeted and the outcome trends over time?
Is there an association between patient sex and outcome trends over time, among the targeted group only?

Maybe important: I'm not actually trying to make any predictions. Just accurately explain the data that I already have.

To answer the first question, I tried:

import statsmodels.formula.api as smf
md = smf.mixedlm('outcome ~ time * targeted', df, groups=df['patient'])

Is this notation correct? Or should I use:

md = smf.mixedlm('outcome ~ time : targeted', df, groups=df['patient'])

to better compare the difference in outcome trends? Or something else?
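
For reference, here is a minimal sketch of how patsy expands the two operators. The toy arrays are made up purely for illustration and are not my real data. As I understand it, `*` keeps both main effects plus the interaction, while `:` keeps only the interaction term.

```python
import numpy as np
from patsy import dmatrix

# Hypothetical toy data, only used to inspect the design-matrix columns.
toy = {
    "time": np.array([0, 1, 2, 0, 1, 2]),
    "targeted": np.array([0, 0, 0, 1, 1, 1]),
}

# 'a * b' is shorthand for 'a + b + a:b'.
full = dmatrix("time * targeted", toy).design_info.column_names
# 'a : b' contributes only the interaction column (plus the default intercept).
inter_only = dmatrix("time : targeted", toy).design_info.column_names

print(full)
print(inter_only)
```

With `time : targeted` alone, the model has no separate slope for the non-targeted group and no baseline difference between groups, so the two formulas fit different models.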

To answer the second question, I tried:

md = smf.mixedlm('outcome ~ time * targeted * sex', df, groups=df['patient'])

But I don't think this is correct because the coefficients don't make sense. Patients who are targeted need to have a starting outcome of >6, but the coefficient for targeted:sex is < 6. One solution is to make a separate dataframe that includes only the targeted patients, but I'm curious if there are operators I can use differently here to get what I want.
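
The separate-dataframe approach mentioned above could be sketched roughly as follows. The synthetic data, the effect sizes, and the name `targeted_df` are all assumptions made up for this example; only the column names match my real data.

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

# Hypothetical synthetic data: 40 patients, 5 visits each.
rng = np.random.default_rng(0)
n_patients, n_visits = 40, 5
ids = np.arange(n_patients)
patient = np.repeat(ids, n_visits)
time = np.tile(np.arange(n_visits), n_patients)
targeted = np.repeat(ids % 2, n_visits)       # alternate targeted / not targeted
sex = np.repeat((ids // 2) % 2, n_visits)     # both sexes appear in each group
offset = np.repeat(rng.normal(0, 1, n_patients), n_visits)  # per-patient shift
noise = rng.normal(0, 0.5, n_patients * n_visits)
outcome = 5 + 0.3 * time + 0.5 * sex * time + offset + noise

df = pd.DataFrame({"patient": patient, "time": time, "targeted": targeted,
                   "sex": sex, "outcome": outcome})

# Keep only the targeted patients, then model time, sex, and their interaction.
targeted_df = df[df["targeted"] == 1]
md = smf.mixedlm("outcome ~ time * sex", targeted_df,
                 groups=targeted_df["patient"])
result = md.fit()
print(result.params)
```

In this subset model, the `time:sex` coefficient would be the difference in time trend between females and males among targeted patients, which is the quantity the second question asks about.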

Thank you!

Did you manually add the intercept term to your model? I think this is required in statsmodels. — Peter, May 24 '20 at 21:45
@Peter I could definitely be wrong, but I think MixedLM includes an intercept by default which I've been interpreting as the outcome at time 0 for the non-targeted group (of males, in the second question). I didn't add a random intercept, though, if that's what you mean. — EMG, May 24 '20 at 21:57
Hi @EMG, if you are open to Bayesian statistics you could try Bambi https://github.com/bambinos/bambi. It allows you to completely specify a mixed model with the formula interface, without having to use other arguments. — Tomas Capretto, Apr 09 '21 at 14:09
