Questions tagged [statsmodels]

Statsmodels is a Python module that allows users to explore data, estimate statistical models, and perform statistical tests.

Homepage: http://www.statsmodels.org/

An extensive list of descriptive statistics, statistical tests, plotting functions, and result statistics is available for different types of data and each estimator. Features include:

  • Linear regression models
  • Generalized linear models
  • Discrete choice models
  • Robust linear models
  • Many models and functions for time series analysis
  • Nonparametric estimators
  • A collection of datasets for examples
  • A wide range of statistical tests
  • Input-output tools for producing tables in a number of formats (Text, LaTeX, HTML) and for reading Stata files into NumPy and pandas.
  • Plotting functions
  • Extensive unit tests to ensure correctness of results
  • Many more models and extensions in development
2841 questions
1 vote, 0 answers

Python / statsmodels - out-of-sample predictions

I am trying to perform autoregressive multiple linear regressions using statsmodels (something like y ~ y_1 + X1 + X2, not ARMA-like). More specifically, I'm looking for a way to get out-of-sample results. When I use the predict method, I get in…
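
One common way to get out-of-sample values from a regression with a lagged dependent variable is to forecast one step at a time and feed each prediction back in as the next lag. The following is a minimal sketch in plain Python, not the asker's code: it assumes the coefficients have already been estimated elsewhere, and the function name, the coefficient values, and the future X1/X2 values are all hypothetical.

```python
def recursive_forecast(params, last_y, future_x1, future_x2):
    """Forecast y_t = const + phi*y_{t-1} + b1*X1_t + b2*X2_t step by step."""
    const, phi, b1, b2 = params
    forecasts = []
    prev_y = last_y
    for x1, x2 in zip(future_x1, future_x2):
        y_hat = const + phi * prev_y + b1 * x1 + b2 * x2
        forecasts.append(y_hat)
        prev_y = y_hat  # the prediction becomes the lag for the next step
    return forecasts


# Hypothetical coefficients (const, lag of y, X1, X2) and future regressors.
params = (1.0, 0.5, 2.0, -1.0)
forecasts = recursive_forecast(
    params, last_y=4.0, future_x1=[1.0, 2.0], future_x2=[0.0, 1.0]
)
print(forecasts)  # [5.0, 6.5]
```

Note that this approach needs future values of X1 and X2, which must be known or forecast separately.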
1 vote, 1 answer

Missing observations and clustered standard errors in Python statsmodels?

What's the cleanest, most Pythonic way to run a regression only on non-missing data and use clustered standard errors? Imagine I have a pandas DataFrame all_data. Clunky method that works (make a DataFrame without missing data): I can make a new…
1 vote, 0 answers

ValueError: On entry to DLASCL parameter number 5 had an illegal value

I have been trying to fit a Markov Switching Regime model using epoch timestamps as my x-axis and I keep receiving this error: "ValueError: On entry to DLASCL parameter number 5 had an illegal value". This error occurs after I fit the model, when I try…
guy
1 vote, 0 answers

Using cPickle for a statsmodels multiple regression formula in Python: Memory Error

I have a multiple linear regression model from statsmodels and I want to save this model and then use it in a different Python script. From looking online, it seems that the best way to do this is with cPickle. However, I seem to be getting a…
HM14
1 vote, 0 answers

x13_arima_analysis: neither x12a nor x13ab available for Windows binaries

Many of those who have used x13_arima_analysis will have seen something like this before now.

>>> import statsmodels.api as sm
>>> results = sm.tsa.x13_arima_analysis(list(range(23)))
Traceback (most recent call last):
  File "",…
Bill Bell
1 vote, 1 answer

Autoregression Parameter for GEE in Python Statsmodels

I'm trying to run a GEE using an autoregressive structure for some panel data in statsmodels, looking at differences between sales during different hours of a shift:

ga = sm.families.Gaussian()
ar = sm.cov_struct.Autoregressive()
times =…
codercat
1 vote, 0 answers

Multiple linear regression results in statsmodels do not make sense

I recently moved to Python for data analysis and apparently I am stuck on the basics. I am trying to estimate the parameters of the following expression: z=20+x+3*y+noise, and I get the right intercept but the parameters are apparently an average of…
famargar
1 vote, 0 answers

Python - Setting constraints on the coefficients of a dummy-variable regression

Is there a way in Python to add a constraint to my OLS regression with dummy variables? I had a look at this link with a possible solution in R. I am using pandas get_dummies() on my DataFrame, without setting the drop_first to…
sumit_uk1
1 vote, 1 answer

How to create all possible combinations of formulas using Patsy for model selection?

I am currently using Python's Patsy module to create matrix inputs for my model. For example, a formula I might use is 'Survived ~ C(Pclass) + C(Sex) + C(honor) + C(tix) + Age + SibSp + ParCh + Fare + Embarked + vowel + middle + C(Title)'. However,…
Naomi
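
Formula strings for every subset of candidate terms can be generated with the standard library before they are handed to Patsy or statsmodels. The sketch below is only an illustration under assumptions: the helper name all_formulas is hypothetical, and the term list is shortened from the formula quoted in the question.

```python
from itertools import combinations


def all_formulas(response, terms):
    """Yield one formula string per non-empty subset of terms."""
    for k in range(1, len(terms) + 1):
        for subset in combinations(terms, k):
            yield f"{response} ~ {' + '.join(subset)}"


# A shortened term list taken from the formula in the question.
terms = ["C(Pclass)", "C(Sex)", "Age"]
formulas = list(all_formulas("Survived", terms))
for formula in formulas:
    print(formula)
```

The number of formulas grows as 2^n - 1, so the 12 terms in the quoted formula would give 4,095 candidate models; fitting all of them may be slow.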
1 vote, 1 answer

Python Vector Error Correction Model

Does anyone have an idea how to model a VECM in Python? I can't find it in the statsmodels package.
BigChief
1 vote, 1 answer

Spline smoothing using statsmodels within a Python pandas DataFrame

I need to smooth sales percentage values by group; the values can be erratic due to out-of-stock situations. I have my data in a pandas DataFrame. Here is the code I am trying:

from scipy.interpolate import UnivariateSpline
s =…
abhiieor
1 vote, 1 answer

Statsmodels API: SARIMAX function missing

I'm trying to follow a tutorial on time series analysis and have hit a hurdle early on. The "SARIMAX" library is unavailable using the following syntax, as per the statsmodels website:

import statsmodels.api as sm
sm.tsa.statespace.SARIMAX

I've…
EB88
1 vote, 0 answers

Python: using RuntimeWarning to adapt my code to unexpected issues

Dear all, I am using Python 3 to fit a large number of regression models on data. I add different parameters to the model in loops to test whether the model is better than other combinations of parameters. I use statsmodels and logit.fit(). I face…
lionrolll
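
To react to a RuntimeWarning inside a loop, the standard warnings module can turn it into an exception for the duration of one call. This is a minimal sketch, assuming the goal is to skip any fit that emits a RuntimeWarning; fit_or_none, noisy_fit, and clean_fit are hypothetical stand-ins, not statsmodels functions.

```python
import warnings


def fit_or_none(fit_func):
    """Run fit_func; return None if it emits a RuntimeWarning."""
    with warnings.catch_warnings():
        # Inside this block only, RuntimeWarning is raised as an exception.
        warnings.simplefilter("error", RuntimeWarning)
        try:
            return fit_func()
        except RuntimeWarning:
            return None


def noisy_fit():
    # Stand-in for a model fit that emits a numerical warning.
    warnings.warn("overflow encountered in exp", RuntimeWarning)
    return 1.0


def clean_fit():
    # Stand-in for a model fit that completes without warnings.
    return 2.0


results = [fit_or_none(noisy_fit), fit_or_none(clean_fit)]
print(results)  # [None, 2.0]
```

The filter applies only inside the with block, so warnings elsewhere in the script keep their normal behavior. Warnings of other categories would need their own filter.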
1 vote, 1 answer

Statsmodels gives different ANOVA results to SPSS

I'm getting acquainted with Statsmodels so as to shift my more complicated stats completely over to Python. However, I'm being cautious, so I'm cross-checking my results with SPSS, just to make sure I'm not making any obvious blunders. Most of the time,…
Lodore66
1 vote, 1 answer

After normalizing data, how to predict y using regression analysis?

I have normalized my data and applied regression analysis to predict yield (y), but my predicted output is also normalized (between 0 and 1). I want my predictions in the original units of the data, not between 0 and 1. Data: Total_yield(y) Rain(x) …
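
If the normalization was min-max scaling (an assumption, suggested by the 0-to-1 range), predictions can be mapped back to the original units by inverting the scaling with the minimum and maximum of the original y. The sketch below uses plain Python; the helper names and the yield values are hypothetical, since the data in the question is truncated.

```python
def minmax_scale(values):
    """Scale values to [0, 1]; return the scaled list and the (min, max) used."""
    lo, hi = min(values), max(values)
    scaled = [(v - lo) / (hi - lo) for v in values]
    return scaled, (lo, hi)


def minmax_inverse(scaled_values, bounds):
    """Map values from the [0, 1] scale back to the original units."""
    lo, hi = bounds
    return [s * (hi - lo) + lo for s in scaled_values]


# Hypothetical yields; the real data in the question is truncated.
total_yield = [100.0, 150.0, 200.0]
scaled_y, y_bounds = minmax_scale(total_yield)

# Pretend the regression returned this prediction on the 0-to-1 scale.
predicted_scaled = [0.25]
predicted_yield = minmax_inverse(predicted_scaled, y_bounds)
print(predicted_yield)  # [125.0]
```

The key point is to keep the bounds computed from the training y and reuse them for the inverse step.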