
Can someone point me to a linear regression package that would not only run the regression but also calculate the significance statistic (estimate divided by its standard error) for each regression coefficient and compare it to the appropriate p-value with (N-k) degrees of freedom? Or that would at least provide output from which these can be calculated?

Ideally for Python, but I will take R as well.

Thank you!

Toly
  • Have you looked at `?lm`/`?summary.lm` in base R? (Technically it's in the `stats` package, but that's auto-loaded when you run R) – Ben Bolker Sep 15 '15 at 19:02
  • I would recommend Googling "python linear regression example", in which the [first hit](http://www.dataschool.io/linear-regression-in-python/) takes you to [this iPython notebook](http://nbviewer.ipython.org/github/justmarkham/DAT4/blob/master/notebooks/08_linear_regression.ipynb), which provides a detailed walkthrough of linear regression in Python. [An Introduction to Statistical Learning](http://www-bcf.usc.edu/~gareth/ISL/) is a great R resource. – Tchotchke Sep 15 '15 at 19:11
  • @Tchotchke - great link! Thank you! Using the terminology there, I am looking for a "continuous, unsupervised" model that would provide "dimension reduction" using a coefficient significance criterion. The issue I see is that not all statistics are conveniently (or even at all) available when pre-built packages are used. Even worse, it is very hard or even impossible to get the data that could be used to extend the hypothesis testing. – Toly Sep 15 '15 at 19:25

2 Answers


In R, `lm()` fits linear models and `summary()` gives the full output, including coefficient estimates, standard errors, t-statistics, and p-values. See https://stat.ethz.ch/R-manual/R-patched/library/stats/html/lm.html

Calvin

statsmodels provides all the standard inference for linear regression and other estimation models.

The output below is copied from this statsmodels notebook: http://statsmodels.sourceforge.net/stable/examples/notebooks/generated/formulas.html

A blog with some explanations:

http://www.datarobot.com/blog/multiple-regression-using-statsmodels/

import statsmodels.api as sm
from statsmodels.formula.api import ols

# Load the Guerry dataset used in the linked notebook
df = sm.datasets.get_rdataset("Guerry", "HistData", cache=True).data

mod = ols(formula='Lottery ~ Literacy + Wealth + Region', data=df)
res = mod.fit()
print(res.summary())
                            OLS Regression Results
==============================================================================
Dep. Variable:                Lottery   R-squared:                       0.338
Model:                            OLS   Adj. R-squared:                  0.287
Method:                 Least Squares   F-statistic:                     6.636
Date:                Tue, 02 Dec 2014   Prob (F-statistic):           1.07e-05
Time:                        12:52:16   Log-Likelihood:                -375.30
No. Observations:                  85   AIC:                             764.6
Df Residuals:                      78   BIC:                             781.7
Df Model:                           6
Covariance Type:            nonrobust
===============================================================================
                  coef    std err          t      P>|t|      [95.0% Conf. Int.]
-------------------------------------------------------------------------------
Intercept      38.6517      9.456      4.087      0.000        19.826    57.478
Region[T.E]   -15.4278      9.727     -1.586      0.117       -34.793     3.938
Region[T.N]   -10.0170      9.260     -1.082      0.283       -28.453     8.419
Region[T.S]    -4.5483      7.279     -0.625      0.534       -19.039     9.943
Region[T.W]   -10.0913      7.196     -1.402      0.165       -24.418     4.235
Literacy       -0.1858      0.210     -0.886      0.378        -0.603     0.232
Wealth          0.4515      0.103      4.390      0.000         0.247     0.656
==============================================================================
Omnibus:                        3.049   Durbin-Watson:                   1.785
Prob(Omnibus):                  0.218   Jarque-Bera (JB):                2.694
Skew:                          -0.340   Prob(JB):                        0.260
Kurtosis:                       2.454   Cond. No.                         371.
==============================================================================

Warnings:
[1] Standard Errors assume that the covariance matrix of the errors is correctly specified.
Josef