Questions tagged [statsmodels]

Statsmodels is a Python module that allows users to explore data, estimate statistical models, and perform statistical tests.

Homepage: http://www.statsmodels.org/

An extensive list of descriptive statistics, statistical tests, plotting functions, and result statistics is available for different types of data and each estimator. Features include:

  • Linear regression models
  • Generalized linear models
  • Discrete choice models
  • Robust linear models
  • Many models and functions for time series analysis
  • Nonparametric estimators
  • A collection of datasets for examples
  • A wide range of statistical tests
  • Input-output tools for producing tables in a number of formats (text, LaTeX, HTML) and for reading Stata files into NumPy and pandas.
  • Plotting functions
  • Extensive unit tests to ensure correctness of results
  • Many more models and extensions in development
2841 questions
17
votes
3 answers

ValueError: endog must be in the unit interval

While using statsmodels, I am getting this weird error: ValueError: endog must be in the unit interval. Can someone give me more information on this error? Google is not helping. Code that produced the error: """ Multiple regression with dummy…
Edward Yu
17
votes
3 answers

Package for time series analysis in python

I am working on time series in Python. The libraries I have found useful and promising are pandas and statsmodels (for ARIMA); simple exponential smoothing is provided by pandas. For visualization: matplotlib. Does anyone know a library…
foc
17
votes
4 answers

Johansen cointegration test in python

I can't find any reference to functionality for performing the Johansen cointegration test in any Python module dealing with statistics and time series analysis (pandas and statsmodels). Does anybody know if there's some code around that can perform such a…
mspadaccino
16
votes
2 answers

Linear regression with dummy/categorical variables

I have a set of data. I have used pandas to convert them into dummy and categorical variables, respectively. Now I want to know how to run a multiple linear regression (I am using statsmodels) in Python. Are there some considerations or maybe I…
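
One possible approach, sketched here under assumed data, is the formula interface: wrapping a column in `C(...)` asks the formula machinery to build the dummy columns, so they do not have to be created by hand. The DataFrame, its column names, and the coefficients used to simulate `y` are all hypothetical.

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(1)

# Hypothetical data: one categorical and one numeric predictor
df = pd.DataFrame({
    "group": np.repeat(["a", "b", "c"], 40),
    "x": rng.normal(size=120),
})
offsets = df["group"].map({"a": 0.0, "b": 1.0, "c": 3.0})
df["y"] = 2.0 * df["x"] + offsets + rng.normal(scale=0.1, size=120)

# C(group) expands into dummies, with the first level as baseline
dummy_results = smf.ols("y ~ x + C(group)", data=df).fit()
print(dummy_results.params)
```

The baseline level is dropped automatically, which avoids perfect collinearity between the dummies and the intercept.
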
16
votes
3 answers

statsmodel AttributeError: module 'scipy.stats' has no attribute 'chisqprob'

I'm running the code below with statsmodels 0.8.0, which I believe is the latest. import statsmodels.api as sm est = sm.Logit(y_train, x_train) result = est.fit() print(result.summary()) This is giving me an error saying: AttributeError: module…
A Rob4
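
`chisqprob` was removed from newer SciPy releases, which older statsmodels code still calls. A commonly cited stopgap, sketched below, is to restore the name using `chi2.sf`, which computes the same upper-tail probability. Upgrading statsmodels is the cleaner route where possible; the statistic passed in at the end is just an example value.

```python
from scipy import stats

# Restore the removed helper only if the installed SciPy lacks it.
if not hasattr(stats, "chisqprob"):
    stats.chisqprob = lambda chisq, df: stats.chi2.sf(chisq, df)

# Example value: upper-tail probability of 3.84 with 1 degree of freedom
p_value = stats.chisqprob(3.84, 1)
print(p_value)
```

The patch must run before the call that triggers the error, for example before `result.summary()`.
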
16
votes
2 answers

How to retrieve model estimates from statsmodels?

From a dataset like this: import pandas as pd import numpy as np import statsmodels.api as sm # A dataframe with two variables np.random.seed(123) rows = 12 rng = pd.date_range('1/1/2017', periods=rows, freq='D') df =…
vestland
16
votes
1 answer

seasonal decompose in python

I have a CSV file that contains the average temperature over almost 5 years. After decomposition using the seasonal_decompose function from statsmodels.tsa.seasonal, I got the following results. Indeed, the results do not show any seasonality! However, I…
16
votes
2 answers

Confidence interval of probability prediction from logistic regression statsmodels

I'm trying to recreate a plot from An Introduction to Statistical Learning and I'm having trouble figuring out how to calculate the confidence interval for a probability prediction. Specifically, I'm trying to recreate the right-hand panel of this…
Taylor
16
votes
3 answers

How to ignore statsmodels Maximum Likelihood convergence warning?

I was trying to find the optimal parameter order by using a loop: d = 1 for p in range(3): for q in range(3): try: order = (p, 0, q) params = (p, d, q) arima_mod = ARIMA(ts.dropna(), order).fit(method…
Jellomima
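
statsmodels defines a `ConvergenceWarning` class in `statsmodels.tools.sm_exceptions`, so the standard `warnings` module can filter that category alone. In the sketch below, `noisy_fit` is a hypothetical stand-in for a model fit that fails to converge; it is not part of statsmodels.

```python
import warnings
from statsmodels.tools.sm_exceptions import ConvergenceWarning


def noisy_fit():
    # Hypothetical stand-in for a fit that emits a convergence warning.
    warnings.warn("Maximum Likelihood optimization failed to converge.",
                  ConvergenceWarning)
    return "fitted"


with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")                      # keep other warnings
    warnings.simplefilter("ignore", ConvergenceWarning)  # drop only this one
    fit_result = noisy_fit()

print(len(caught))   # number of warnings that were not suppressed
```

Filtering by category keeps unrelated warnings visible, which a blanket `warnings.filterwarnings("ignore")` would hide.
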
16
votes
3 answers

How to get R-squared for robust regression (RLM) in Statsmodels?

When it comes to measuring goodness of fit, R-squared seems to be a commonly understood (and accepted) measure for "simple" linear models. But in the case of statsmodels (as well as other statistical software), RLM does not include R-squared together…
Primer
16
votes
1 answer

Getting statsmodels to use heteroskedasticity corrected standard errors in coefficient t-tests

I've been digging into the API of statsmodels.regression.linear_model.RegressionResults and have found how to retrieve different flavors of heteroskedasticity corrected standard errors (via properties like HC0_se, etc.) However, I can't quite…
sparc_spread
16
votes
1 answer

How to fit a model to my testing set in statsmodels (python)

I am working on a logistic regression model and I am having trouble understanding how to take the model fit from my training set onto my testing set. Sorry, I am new to Python and VERY new to statsmodels. import pandas as pd import statsmodels.api…
statsNoob
15
votes
3 answers

Random Forest Regressor using a custom objective/ loss function (Python/ Sklearn)

I want to build a Random Forest Regressor to model count data (Poisson distribution). The default 'mse' loss function is not suited to this problem. Is there a way to define a custom loss function and pass it to the random forest regressor in Python…
vishmay
15
votes
2 answers

Access standardized residuals, cook's values, hatvalues (leverage) etc. easily in Python?

I am looking for influence statistics after fitting a linear regression. In R I can obtain them (e.g.) like this: hatvalues(fitted_model) #hatvalues (leverage) cooks.distance(fitted_model) #Cook's D values rstandard(fitted_model) #standardized…
Jaynes01
15
votes
2 answers

What is the proper way to perform Latent Class Analysis in Python?

I'd like to model a data set using Latent Class Analysis (LCA) using Python. I've found the Factor Analysis class in sklearn, but I'm not confident that this class is equivalent to LCA. Does a package or class for LCA exist in Python?
Jessime Kirk