Questions tagged [statsmodels]

Statsmodels is a Python module that allows users to explore data, estimate statistical models, and perform statistical tests.

Homepage: http://www.statsmodels.org/

An extensive list of descriptive statistics, statistical tests, plotting functions, and result statistics is available for different types of data and for each estimator. Features include:

  • Linear regression models
  • Generalized linear models
  • Discrete choice models
  • Robust linear models
  • Many models and functions for time series analysis
  • Nonparametric estimators
  • A collection of datasets for examples
  • A wide range of statistical tests
  • Input-output tools for producing tables in a number of formats (text, LaTeX, HTML) and for reading Stata files into NumPy and pandas.
  • Plotting functions
  • Extensive unit tests to ensure correctness of results
  • Many more models and extensions in development
2841 questions
21
votes
3 answers

logit regression and singular Matrix error in Python

I am trying to run a logit regression on the German credit data (www4.stat.ncsu.edu/~boos/var.select/german.credit.html). To test the code, I have used only the numerical variables and tried regressing them against the result using the following code. import pandas…
user3122731
  • 211
  • 1
  • 2
  • 4
20
votes
2 answers

Holt-Winters time series forecasting with statsmodels

I tried forecasting with a Holt-Winters model as shown below, but I keep getting a prediction that is not consistent with what I expect. I also show a visualization of the plot. Train = Airline[:130] Test = Airline[129:] from…
Mujeebla
  • 203
  • 1
  • 2
  • 6
20
votes
1 answer

ValueWarning: No frequency information was provided, so inferred frequency MS will be used

I am trying to fit an autoregression with sm.tsa.statespace.SARIMAX, but I get a warning, so I want to set the frequency information for this model. Has anyone run into this, and can you help? fit1 = sm.tsa.statespace.SARIMAX(train.Demand, order=(1, 0, 0), …
Lê Ngọc Thạch
  • 201
  • 1
  • 2
  • 5
20
votes
2 answers

Statsmodels ARIMA - Different results using predict() and forecast()

I use ARIMA from statsmodels package in order to predict values from a series: plt.plot(ind, final_results.predict(start=0 ,end=26)) plt.plot(ind, forecast.values) plt.show() I thought that I would get the same results from these two methods, but…
Simone
  • 4,800
  • 12
  • 30
  • 46
19
votes
3 answers

How to perform a chi-squared goodness of fit test using scientific libraries in Python?

Let's assume I have some data I obtained empirically: from scipy import stats size = 10000 x = 10 * stats.expon.rvs(size=size) + 0.2 * np.random.uniform(size=size) It is exponentially distributed (with some noise) and I want to verify this using a…
metakermit
  • 21,267
  • 15
  • 86
  • 95
19
votes
3 answers

Specifying which category to treat as the base with 'statsmodels'

I understand that when I have a categorical variable in a model passed to a statsmodels fit, dummy variables will automatically be generated for the categories. For example if I have a variable 'Location' with values 'IndianOcean', 'Thailand',…
orome
  • 45,163
  • 57
  • 202
  • 418
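
For the base-category question above, a minimal sketch using the formula interface is shown here. It assumes patsy-style formulas, where `C(..., Treatment(reference=...))` selects the reference level. The third level `'China'` and all `y` values are hypothetical, since the excerpt truncates the list of locations.

```python
import pandas as pd
import statsmodels.formula.api as smf

# Hypothetical data: 'China' and the y values are made up for illustration
df = pd.DataFrame({
    "Location": ["IndianOcean", "Thailand", "China"] * 4,
    "y": [1.0, 2.0, 3.0, 1.2, 2.1, 2.9, 0.9, 1.8, 3.2, 1.1, 2.2, 3.1],
})

# Treat 'Thailand' as the base level; dummies are created for the other levels
formula = 'y ~ C(Location, Treatment(reference="Thailand"))'
results = smf.ols(formula, data=df).fit()

print(results.params)  # the intercept is the mean of y for the base level
```

With treatment coding, the remaining coefficients are differences from the chosen base level.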
18
votes
2 answers

Ignoring missing values in multiple OLS regression with statsmodels

I'm trying to run a multiple OLS regression using statsmodels and a pandas dataframe. There are missing values in different columns for different rows, and I keep getting the error message: ValueError: array must not contain infs or NaNs I saw this…
user2649353
  • 367
  • 2
  • 3
  • 9
17
votes
6 answers

AttributeError: module 'statsmodels.formula.api' has no attribute 'OLS'

I am trying to use Ordinary Least Squares for multivariable regression, but it says that there is no attribute 'OLS' in the statsmodels.formula.api library. I am following the code from a lecture on Udemy. The code is as follows: import…
17
votes
3 answers

python 3.5 in statsmodels ImportError: cannot import name '_representation'

I cannot manage to import statsmodels.api correctly. When I do, I get this error: File "/home/mlv/.local/lib/python3.5/site-packages/statsmodels/tsa/statespace/tools.py", line 59, in set_mode from . import (_representation,…
Jérémy
  • 340
  • 1
  • 3
  • 13
17
votes
3 answers

How to interpret adfuller test results?

I am struggling to understand the concept of p-value and the various other results of adfuller test. The code I am using: (I found this code in Stack Overflow) import numpy as np import os import pandas as pd import statsmodels.api as sm import…
Sid
  • 3,749
  • 7
  • 29
  • 62
17
votes
4 answers

Deprecated rolling window option in OLS from Pandas to Statsmodels

As the title suggests, where has the rolling function option in the ols command in pandas migrated to in statsmodels? I can't seem to find it. Pandas tells me doom is in the works: FutureWarning: The pandas.stats.ols module is deprecated and will be…
Asher11
  • 1,295
  • 2
  • 15
  • 31
17
votes
1 answer

add_constant() in statsmodels not working

I try to use the add_constant() function with an array of dataset. At index 59 it works (the column is created) but at index 60 it isn't created. Initially, testmat[59] returns a shape of (24, 54) and testmat[60] a shape of (9, 54). Hereafter is…
florian
  • 881
  • 2
  • 8
  • 24
17
votes
3 answers

Python: How to evaluate the residuals in StatsModels?

I want to evaluate the residuals: (y-hat y). I know how to do that: df = pd.read_csv('myFile', delim_whitespace = True, header = None) df.columns = ['column1', 'column2'] y, X = ps.dmatrices('column1 ~ column2',data = df, return_type =…
DanielTheRocketMan
  • 3,199
  • 5
  • 36
  • 65
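
For the residuals question above, a fitted OLS results object exposes the residuals directly as `resid`. The sketch below reuses the column names from the excerpt with synthetic values and checks the attribute against a manual computation of observed minus fitted values.

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

# Synthetic stand-in for the question's two-column file
rng = np.random.default_rng(0)
df = pd.DataFrame({"column2": rng.normal(size=40)})
df["column1"] = 0.5 + 1.5 * df["column2"] + rng.normal(scale=0.2, size=40)

results = smf.ols("column1 ~ column2", data=df).fit()

residuals = results.resid                        # y minus fitted y, per observation
manual = df["column1"] - results.fittedvalues    # the same quantity computed by hand
print(residuals.head())
```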
17
votes
1 answer

Time Series Analysis - unevenly spaced measures - pandas + statsmodels

I have two numpy arrays light_points and time_points and would like to use some time series analysis methods on those data. I then tried this : import statsmodels.api as sm import pandas as pd tdf = pd.DataFrame({'time':time_points[:]}) rdf = …
Robin
  • 605
  • 2
  • 8
  • 25
17
votes
1 answer

Analysing Time Series in Python - pandas formatting error - statsmodels

I am trying to analyse stars' data. I have light time series of the stars and I want to predict to which class (among 4 different types) they belong. I have light time series of those stars, and I want to analyse those time series by doing…
Robin
  • 605
  • 2
  • 8
  • 25