I am trying to predict the count for a particular data set over varying time periods. Hence, I'm using an offset to account for these different time periods.

However, while trying to fit the model I get this error: `LinAlgError: Singular matrix`. I don't know what this means. This is my code and sample dataset:

import statsmodels.formula.api as smf
from statsmodels.genmod.families import Poisson

glm = smf.gee
model = glm(
    "Number_Of_Claims ~ Number_Of_Users + Number_Of_Vehicles + "
    "Total_Miles + Category_Adults + Category_Business + Category_Senior + "
    "Category_MedExp + Territory_1 + Territory_2 + Territory_3 + Territory_4 + "
    "Territory_5 + Liability_Exposure + PIP_Exposure",
    groups=None,
    data=train_init,
    offset=train_init.Exposure_Term,
    family=Poisson(),
)

results = model.fit()

This is my sample data: [image: sample of training dataset]

My continuous variables are Number_Of_Users, Number_Of_Vehicles and Total_Miles. The rest are dummy variables.

Niranjan
  • After looking [here](https://www.statsmodels.org/dev/gee.html), I suppose you need to specify the parameters `cov_struct` and `family`. Some [examples](https://www.statsmodels.org/dev/generated/statsmodels.genmod.generalized_estimating_equations.GEE.html#statsmodels.genmod.generalized_estimating_equations.GEE) from the docs may help you – Dmitriy Kisil Sep 18 '18 at 11:12
  • Problem solved. My offset in the GLM was time in days, which is too large and frankly does not make sense. I switched to time in months and the model works fine – Niranjan Sep 18 '18 at 11:23

0 Answers