Questions tagged [statsmodels]

Statsmodels is a Python module that allows users to explore data, estimate statistical models, and perform statistical tests.

Homepage: http://www.statsmodels.org/

An extensive list of descriptive statistics, statistical tests, plotting functions, and result statistics is available for different types of data and each estimator. Features include:

  • Linear regression models
  • Generalized linear models
  • Discrete choice models
  • Robust linear models
  • Many models and functions for time series analysis
  • Nonparametric estimators
  • A collection of datasets for examples
  • A wide range of statistical tests
  • Input-output tools for producing tables in a number of formats (text, LaTeX, HTML) and for reading Stata files into NumPy and pandas.
  • Plotting functions
  • Extensive unit tests to ensure correctness of results
  • Many more models and extensions in development
2841 questions
15
votes
1 answer

Extracting coefficients from GLM in Python using statsmodels

I have a model which is defined as follows: import statsmodels.formula.api as smf model = smf.glm(formula="A ~ B + C + D", data=data, family=sm.families.Poisson()).fit() The model has coefficients which look like so: Intercept 0.319813 C[T.foo] …
user2844485
  • 1,112
  • 3
  • 15
  • 26
15
votes
1 answer

scipy.stats.expon.fit() with no location parameter

I am using scipy.stats.expon.fit(data) to fit an exponential distribution to my data. This appears to return two values where I would expect one. The documentation online doesn't seem to say what fit() returns but looking at the source, I am…
Simd
  • 19,447
  • 42
  • 136
  • 271
15
votes
1 answer

python 3 + statsmodels?

If I do sudo pip3 install statsmodels I get errors. I pasted the end of the console output below. I see a numpy 1.7 warning, yet if I do pip3 freeze | grep numpy, I see that I'm using numpy==1.8.1. Here is the output. any…
user2684301
  • 2,550
  • 1
  • 24
  • 33
15
votes
2 answers

ARMA out-of-sample prediction with statsmodels

I'm using statsmodels to fit an ARMA model. import statsmodels.api as sm arma = sm.tsa.ARMA(data, order =(4,4)); results = arma.fit( full_output=False, disp=0); Where data is a one-dimensional array. I know to get in-sample predictions: pred =…
sirip82
  • 187
  • 1
  • 1
  • 8
15
votes
2 answers

Predicting values using an OLS model with statsmodels

I calculated a model using OLS (multiple linear regression). I divided my data into train and test sets (half each), and then I would like to predict values for the 2nd half of the labels. model = OLS(labels[:half], data[:half]) predictions =…
nickb
  • 882
  • 3
  • 8
  • 22
14
votes
2 answers

Is LASSO regression implemented in Statsmodels?

I would love to use a linear LASSO regression within statsmodels, so as to be able to use the 'formula' notation for writing the model. That would save me quite some coding time when working with many categorical variables and their interactions.…
famargar
  • 3,258
  • 6
  • 28
  • 44
14
votes
0 answers

ValueError: You must specify a freq or x must be a pandas object with a timeseries index

I have obtained time-series data this way: from pandas.io.data import DataReader from datetime import datetime ts_log = DataReader('RUB=X', 'yahoo', datetime(2007,1,1), datetime(2016,8,30))["Adj Close"] ts_log looks this way: Date 2007-01-01 …
Rocketq
  • 5,423
  • 23
  • 75
  • 126
14
votes
2 answers

How to calculate the 99% confidence interval for the slope in a linear regression model in Python?

We have the following linear regression: y ~ b0 + b1 * x1 + b2 * x2. I know that the regress function in MATLAB does calculate it, but numpy's linalg.lstsq doesn't (https://docs.scipy.org/doc/numpy-dev/user/numpy-for-matlab-users.html).
user2558053
  • 435
  • 2
  • 6
  • 12
14
votes
1 answer

statsmodels ARIMA.fit: Hide output

It seems whenever I run ARIMA.fit(), I always get a stdout from the kalman filter: ## -- End pasted text -- RUNNING THE L-BFGS-B CODE * * * Machine precision = 2.220D-16 N = 1 M = 12 This problem is…
hlin117
  • 20,764
  • 31
  • 72
  • 93
14
votes
2 answers

Difference(s) between scipy.stats.linregress, numpy.polynomial.polynomial.polyfit and statsmodels.api.OLS

It seems all three functions can do simple linear regression, e.g. scipy.stats.linregress(x, y) numpy.polynomial.polynomial.polyfit(x, y, 1) x = statsmodels.api.add_constant(x) statsmodels.api.OLS(y, x) I wonder if there is any real difference…
MLister
  • 10,022
  • 18
  • 64
  • 92
14
votes
1 answer

Statsmodels version 0.6.1 does not include tsa?

I'm trying to get the HP-filter working using statsmodels (sm). The documentation here implies that the module sm.tsa already exists for 0.6.1, but I get the following error: >>> import statsmodels as sm >>> sm.__version__ '0.6.1' >>>…
FooBar
  • 15,724
  • 19
  • 82
  • 171
13
votes
3 answers

Using categorical variables in statsmodels OLS class

I want to use statsmodels OLS class to create a multiple regression model. Consider the following dataset: import statsmodels.api as sm import pandas as pd import numpy as np dict = {'industry': ['mining', 'transportation', 'hospitality',…
Todd Shannon
  • 527
  • 1
  • 6
  • 20
13
votes
2 answers

Shape not aligned error in OLS Regression python

I have a dataframe on which I am trying to run the statsmodels.api OLS regression. It prints out the summary, but when I use the predict() function, it gives me an error: shapes (75,7) and (6,) not aligned: 7 (dim 1) != 6 (dim 0) My…
Trisa Biswas
  • 555
  • 1
  • 3
  • 17
13
votes
1 answer

Cannot plot predicted time series values using matplotlib

I am trying to plot my actual time series values and predicted values but it gives me this error: ValueError: view limit minimum -36816.95989583333 is less than 1 and is an invalid Matplotlib date value. This often happens if you pass a…
Joey12
  • 177
  • 1
  • 10
13
votes
1 answer

Why do the `sklearn` and `statsmodels` implementations of OLS regression give different R^2?

I accidentally noticed that OLS models implemented by sklearn and statsmodels yield different values of R^2 when not fitting an intercept. Otherwise they seem to work fine. The following code yields: import numpy as np import sklearn import…
abukaj
  • 2,582
  • 1
  • 22
  • 45