Questions tagged [linear-regression]

For issues related to the linear regression modelling approach.

Linear Regression is a formalization of relationships between variables in the form of mathematical equations. It describes how one or more random variables are related to one or more other variables. Here the variables are not deterministically but stochastically related.

Example

Height and age are probabilistically distributed over humans. They are stochastically related; when you know that a person is of age 30, this influences the chance of this person being 4 feet tall. When you know that a person is of age 13, this influences the chance of this person being 6 feet tall.

Model 1

$\text{height}_i = b_0 + b_1\,\text{age}_i + \varepsilon_i$, where $b_0$ is the intercept, $b_1$ is the coefficient by which age is multiplied to predict height, $\varepsilon_i$ is the error term, and $i$ indexes the subject
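As a minimal sketch of how Model 1 might be estimated, the closed-form least-squares formulas for one predictor can be written in plain Python. The ages and heights below are invented, noise-free values (generated from height = 80 + 6 × age, so the error term is zero), and `fit_simple_ols` is a hypothetical helper name, not part of any library.

```python
# Hypothetical, noise-free data generated from height = 80 + 6 * age
# (illustrative units: cm and years). Real data would include an error term.
ages = [2.0, 4.0, 6.0, 8.0, 10.0]
heights = [92.0, 104.0, 116.0, 128.0, 140.0]


def fit_simple_ols(x, y):
    """Return (b0, b1) that minimise the sum of squared residuals."""
    n = len(x)
    x_mean = sum(x) / n
    y_mean = sum(y) / n
    # Slope: covariance of x and y divided by the variance of x.
    sxy = sum((xi - x_mean) * (yi - y_mean) for xi, yi in zip(x, y))
    sxx = sum((xi - x_mean) ** 2 for xi in x)
    b1 = sxy / sxx
    # Intercept: the fitted line passes through the point of means.
    b0 = y_mean - b1 * x_mean
    return b0, b1


b0, b1 = fit_simple_ols(ages, heights)
print(b0, b1)
```

Because the data are noise-free, the estimates recover the generating intercept and slope; with real measurements they would only approximate them.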

Model 2

$\text{height}_i = b_0 + b_1\,\text{age}_i + b_2\,\text{sex}_i + \varepsilon_i$, where the variable sex is dichotomous
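One way Model 2 could be fitted is to code the dichotomous variable as 0/1 and solve the least-squares problem with `numpy.linalg.lstsq`. The data are invented and noise-free (generated from height = 80 + 6 × age + 4 × sex), and the 0/1 coding is an assumption made here for illustration.

```python
import numpy as np

# Hypothetical, noise-free data; sex is dummy-coded as 0 or 1 (an assumption).
ages = np.array([2.0, 4.0, 6.0, 8.0, 10.0, 12.0])
sex = np.array([0.0, 1.0, 0.0, 1.0, 1.0, 0.0])
heights = 80.0 + 6.0 * ages + 4.0 * sex

# Design matrix: a column of ones for the intercept, then age, then sex.
X = np.column_stack([np.ones_like(ages), ages, sex])

# lstsq returns (coefficients, residuals, rank, singular values).
coef, _, rank, _ = np.linalg.lstsq(X, heights, rcond=None)
b0, b1, b2 = coef
print(b0, b1, b2)
```

Under this coding, `b2` is the expected difference in height between the two groups at the same age.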

In linear regression, user data X is modelled using a linear function Y of X, and the unknown model parameters W are estimated or learned from the data. For example, a linear regression model for k-dimensional user data can be represented as:

$Y = w_1 x_1 + w_2 x_2 + \dots + w_k x_k$
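To illustrate what "learned from the data" can mean for the weights in this equation, here is a minimal sketch that estimates them by batch gradient descent on the mean squared error. This is only one of several estimation methods. The data, the learning rate, the step count, and the helper names `predict` and `fit_gradient_descent` are all assumptions chosen for this example.

```python
def predict(w, x):
    """Linear model without intercept: Y = w1*x1 + ... + wk*xk."""
    return sum(wj * xj for wj, xj in zip(w, x))


def fit_gradient_descent(X, y, lr=0.05, steps=1000):
    """Estimate w by batch gradient descent on the mean squared error."""
    n, k = len(X), len(X[0])
    w = [0.0] * k
    for _ in range(steps):
        residuals = [predict(w, xi) - yi for xi, yi in zip(X, y)]
        # Gradient of (1/n) * sum(residual^2) with respect to each weight.
        grad = [2.0 / n * sum(r * xi[j] for r, xi in zip(residuals, X))
                for j in range(k)]
        w = [wj - lr * gj for wj, gj in zip(w, grad)]
    return w


# Hypothetical 2-dimensional data generated from known weights, no noise.
X = [[1.0, 2.0], [2.0, 1.0], [3.0, 4.0], [4.0, 3.0]]
true_w = [2.0, -1.0]
y = [predict(true_w, xi) for xi in X]

w_hat = fit_gradient_descent(X, y)
print(w_hat)
```

The learning rate has to be small enough for the data at hand; the value here was picked for this toy data set and would probably need tuning elsewhere.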

Further reading: "Statistical Modeling: The Two Cultures", http://projecteuclid.org/download/pdf_1/euclid.ss/1009213726

In R, the scientific software for statistical computing and graphics, the function lm implements linear regression.

6517 questions

9 answers

How to find the features names of the coefficients using scikit linear regression?

I use scikit linear regression and if I change the order of the features, the coef are still printed in the same order, hence I would like to know the mapping of the feature with the coeff. #training the model model_1_features = ['sqft_living',…

python machine-learning scikit-learn linear-regression

asked Jan 07 '16 at 07:58

amehta

2 answers

How (and why) do you use contrasts?

Under what cases do you create contrasts in your analysis? How is it done and what is it used for? I checked ?contrasts and ?C - both lead to "Chapter 2 of Statistical Models in S", which is not readily available to me.

r linear-regression categorical-data contrast

asked Feb 28 '10 at 20:51

Tal Galili

8 answers

ValueError: Expected 2D array, got 1D array instead:

While practicing Simple Linear Regression Model I got this error, I think there is something wrong with my data set. Here is my data set: Here is independent variable X: Here is dependent variable Y: Here is X_train Here Is Y_train This is error…

python-3.x scikit-learn linear-regression

asked Jul 03 '18 at 08:40

danyialKhan

2 answers

Pandas rolling regression: alternatives to looping

I got good use out of pandas' MovingOLS class (source here) within the deprecated stats/ols module. Unfortunately, it was gutted completely with pandas 0.20. The question of how to run rolling OLS regression in an efficient manner has been asked…

python pandas numpy linear-regression statsmodels

asked Jun 06 '17 at 01:31

Brad Solomon

6 answers

python linear regression predict by date

I want to predict a value at a date in the future with simple linear regression, but I can't due to the date format. This is the dataframe I have: data_df = date value 2016-01-15 1555 2016-01-16 1678 2016-01-17 1789 ... y =…

python date pandas linear-regression

asked Oct 24 '16 at 11:35

jeangelj

3 answers

How to add interaction term in Python sklearn

If I have independent variables [x1, x2, x3] If I fit linear regression in sklearn it will give me something like this: y = a*x1 + b*x2 + c*x3 + intercept Polynomial regression with poly =2 will give me something like y = a*x1^2 + b*x1*x2…

python scikit-learn regression linear-regression

asked Aug 23 '17 at 00:47

Dylan

2 answers

How does predict.lm() compute confidence interval and prediction interval?

I ran a regression: CopierDataRegression <- lm(V1~V2, data=CopierData1) and my task was to obtain a 90% confidence interval for the mean response given V2=6 and 90% prediction interval when V2=6. I used the following code: X6 <-…

r regression linear-regression prediction lm

asked Jun 29 '16 at 20:30

Mitty

8 answers

Are there any Linear Regression Function in SQL Server?

Are there any Linear Regression Function in SQL Server 2005/2008, similar to the Linear Regression functions in Oracle?

sql-server-2005 sql-server-2008 statistics linear-regression

asked Mar 29 '10 at 09:31

rao

3 answers

OLS Regression: Scikit vs. Statsmodels?

Short version: I was using the scikit LinearRegression on some data, but I'm used to p-values so put the data into the statsmodels OLS, and although the R^2 is about the same the variable coefficients are all different by large amounts. This…

python scikit-learn linear-regression statsmodels

asked Feb 26 '14 at 22:34

Nat Poor

2 answers

lme4::lmer reports "fixed-effect model matrix is rank deficient", do I need a fix and how to?

I am trying to run a mixed-effects model that predicts F2_difference with the rest of the columns as predictors, but I get an error message that says fixed-effect model matrix is rank deficient so dropping 7 columns / coefficients. From this…

r regression linear-regression lme4 mixed-models

asked May 07 '16 at 16:06

Lisa

8 answers

Scikit-Learn Linear Regression how to get coefficient's respective features?

I'm trying to perform feature selection by evaluating my regressions coefficient outputs, and select the features with the highest magnitude coefficients. The problem is, I don't know how to get the respective features, as only coefficients are…

scikit-learn linear-regression feature-selection

asked Nov 15 '14 at 23:14

jeffrey

3 answers

Python scikit learn Linear Model Parameter Standard Error

I am working with sklearn and specifically the linear_model module. After fitting a simple linear as in import pandas as pd import numpy as np from sklearn import linear_model randn = np.random.randn X = pd.DataFrame(randn(10,3),…

python scikit-learn linear-regression variance

asked Mar 13 '14 at 14:20

Ryan

2 answers

How to plot statsmodels linear regression (OLS) cleanly

Problem Statement: I have some nice data in a pandas dataframe. I'd like to run simple linear regression on it: Using statsmodels, I perform my regression. Now, how do I get my plot? I've tried statsmodels' plot_fit method, but the plot is a little…

python pandas matplotlib linear-regression statsmodels

asked Feb 15 '17 at 23:20

Alex Lenail

1 answer

Linear Regression and Gradient Descent in Scikit learn?

In this Coursera course for machine learning, it says gradient descent should converge. I'm using Linear regression from scikit learn. It doesn't provide gradient descent info. I have seen many questions on StackOverflow to implement linear…

python machine-learning scikit-learn linear-regression

asked Dec 26 '15 at 06:57

Netro

3 answers

Why is numpy.linalg.pinv() preferred over numpy.linalg.inv() for creating inverse of a matrix in linear regression

If we want to search for the optimal parameters theta for a linear regression model by using the normal equation with: theta = inv(X^T * X) * X^T * y one step is to calculate inv(X^T*X). Therefore numpy provides np.linalg.inv() and…

python numpy matrix linear-algebra linear-regression

asked Mar 19 '18 at 07:04

2Obe
