Questions tagged [linear-regression]

For questions about the linear regression modelling approach.

Linear regression is a formalization of relationships between variables in the form of mathematical equations. It describes how one or more random variables are related to one or more other variables, where the relationship is stochastic rather than deterministic.

Example

Height and age are probabilistically distributed over humans and are stochastically related: knowing that a person is 30 years old changes the probability that they are 4 feet tall, and knowing that a person is 13 years old changes the probability that they are 6 feet tall.

Model 1

height_i = b_0 + b_1 · age_i + ε_i, where b_0 is the intercept, b_1 is the coefficient by which age is multiplied to predict height, ε_i is the error term, and i indexes the subject

Model 2

height_i = b_0 + b_1 · age_i + b_2 · sex_i + ε_i, where the variable sex is dichotomous
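
A minimal sketch of fitting Model 2 with Python's statsmodels formula API; the data frame df and its values are purely illustrative:

```
import pandas as pd
import statsmodels.formula.api as smf

# Hypothetical data frame standing in for real measurements.
df = pd.DataFrame({
    "height": [150, 162, 175, 168, 180, 158],
    "age":    [13,  16,  30,  25,  40,  14],
    "sex":    ["F", "M", "M", "F", "M", "F"],
})

# Model 2: height_i = b_0 + b_1*age_i + b_2*sex_i + e_i.
# C(sex) dummy-codes the dichotomous variable; the intercept b_0 is implicit.
fit = smf.ols("height ~ age + C(sex)", data=df).fit()
print(fit.params)
```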

In linear regression, the response Y is modelled as a linear function of the data X, and the unknown model parameters W are estimated (learned) from the data. For example, a linear regression model for k-dimensional data can be written as:

Y = w_1 x_1 + w_2 x_2 + ... + w_k x_k
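
Estimating the weights w_1, ..., w_k by ordinary least squares can be sketched with NumPy; the array shapes and values below are illustrative:

```
import numpy as np

rng = np.random.default_rng(0)
n, k = 100, 3                                   # n samples, k features
X = rng.normal(size=(n, k))
true_w = np.array([1.5, -2.0, 0.5])
Y = X @ true_w + rng.normal(scale=0.1, size=n)  # noisy linear response

# Least-squares estimate of the weight vector W = (w_1, ..., w_k).
w_hat, *_ = np.linalg.lstsq(X, Y, rcond=None)
print(w_hat)                                    # close to true_w
```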

Further reading: Leo Breiman, "Statistical Modeling: The Two Cultures", http://projecteuclid.org/download/pdf_1/euclid.ss/1009213726

In R, free software for statistical computing and graphics, the function lm() implements linear regression.

6517 questions
9 votes, 2 answers

ValueError: Found input variables with inconsistent numbers of samples: [2750, 1095]

It would be really helpful if someone could help me understand this error and what I need to do to fix it. I cannot change my data. X = train[['id', 'listing_type', 'floor', 'latitude', 'longitude', 'beds',…
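
This error is raised when the X and y passed to an estimator have different numbers of rows; a minimal sketch with toy arrays (not the question's train data):

```
import numpy as np
from sklearn.linear_model import LinearRegression

X = np.arange(20).reshape(10, 2)   # 10 samples, 2 features
y = np.arange(4)                   # only 4 targets -> mismatch

try:
    LinearRegression().fit(X, y)
except ValueError as err:
    print(err)   # inconsistent numbers of samples: [10, 4]

# Fix: build X and y from the same rows so their lengths match.
y = np.arange(10)
LinearRegression().fit(X, y)       # now works
```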
9 votes, 1 answer

How does R handle ordinal predictors in lm()?

As I understand it, when you fit a linear model in R using a nominal predictor, R essentially uses dummy 1/0 variables for each level (except the reference level) and then gives a regular old coefficient for each of these variables. What does it…
MissMonicaE • 709 • 1 • 8 • 15
9 votes, 3 answers

Getting 'ValueError: shapes not aligned' on SciKit Linear Regression

Quite new to SciKit and linear algebra/machine learning with Python in general, so I can't seem to solve the following: I have a training set and a test set of data, containing both continuous and discrete/categorical values. The CSV files are…
Koen • 947 • 1 • 10 • 15
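
A common cause of this error is pd.get_dummies() producing different columns for the train and test CSVs; one way to keep the matrices aligned, sketched with hypothetical frames:

```
import pandas as pd
from sklearn.linear_model import LinearRegression

# Hypothetical stand-ins for the train/test CSVs in the question.
train = pd.DataFrame({"size": [50, 80, 120], "city": ["A", "B", "A"], "price": [100, 160, 230]})
test = pd.DataFrame({"size": [60, 95], "city": ["B", "C"]})

X_train = pd.get_dummies(train.drop(columns="price"))
X_test = pd.get_dummies(test)                  # has city_C, lacks city_A

# Align test columns to the training columns; missing dummies become 0.
X_test = X_test.reindex(columns=X_train.columns, fill_value=0)

model = LinearRegression().fit(X_train, train["price"])
print(model.predict(X_test))
```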
9 votes, 1 answer

Spark load model and continue training

I'm using Scala with Spark 2.0 to train a model with LinearRegression. val lr = new LinearRegression() .setMaxIter(num_iter) .setRegParam(reg) .setStandardization(true) val model = lr.fit(data) this is working fine and I get good results. I…
Silu • 176 • 1 • 7
9 votes, 1 answer

Multi-variable linear regression with scipy linregress

I'm trying to train a very simple linear regression model. My code is: from scipy import stats xs = [[ 0, 1, 153] [ 1, 2, 0] [ 2, 3, 125] [ 3, 1, 93] [ 2, 24, 5851] [ 3, 1, 524] [ 4, 1, 0] [ 2,…
jbrown • 7,518 • 16 • 69 • 117
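
scipy.stats.linregress fits only a single predictor against a single response; for several predictor columns, a least-squares solve with NumPy is one alternative. The target values below are hypothetical, since the question is truncated:

```
import numpy as np

# First rows of xs from the question; ys is made up for illustration.
xs = np.array([[0, 1, 153], [1, 2, 0], [2, 3, 125], [3, 1, 93], [2, 24, 5851]], dtype=float)
ys = np.array([1.0, 0.5, 2.0, 1.5, 30.0])

# Add a column of ones so an intercept is estimated alongside the 3 slopes.
A = np.column_stack([xs, np.ones(len(xs))])
coefs, *_ = np.linalg.lstsq(A, ys, rcond=None)
print(coefs)    # one coefficient per column of xs, plus the intercept
```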
9 votes, 1 answer

Spark ml and PMML export

I know that it's possible to export models as PMML with Spark-MLlib, but what about Spark-ML? Is it possible to convert LinearRegressionModel from org.apache.spark.ml.regression to a LinearRegressionModel from org.apache.spark.mllib.regression to be…
philippe • 121 • 1 • 6
9 votes, 0 answers

Bayesian error-in-variables (total least squares) model in R using MCMCglmm

I am fitting some Bayesian linear mixed models using the MCMCglmm package in R. My data includes predictors that are measured with error. I'd therefore like to build a model that takes this into account. My understanding is that a basic mixed…
Alberto • 133 • 5
9 votes, 1 answer

Feature mapping using multi-variable polynomial

Suppose we have a data matrix of data points and we are interested in mapping those points into a higher-dimensional feature space. We can do this using d-degree polynomials. Thus for a sequence of data points the new data matrix is I have…
Thoth • 993 • 12 • 36
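
A sketch of such a d-degree polynomial feature map using scikit-learn's PolynomialFeatures; the degree and data are illustrative:

```
import numpy as np
from sklearn.preprocessing import PolynomialFeatures
from sklearn.linear_model import LinearRegression

X = np.array([[1.0, 2.0], [2.0, 3.0], [3.0, 5.0], [4.0, 4.0], [5.0, 7.0], [6.0, 6.0]])
y = np.array([2.0, 6.0, 15.0, 16.0, 35.0, 36.0])

d = 2
poly = PolynomialFeatures(degree=d, include_bias=False)
X_poly = poly.fit_transform(X)      # columns: x1, x2, x1^2, x1*x2, x2^2

model = LinearRegression().fit(X_poly, y)
print(X_poly.shape, model.coef_)    # (6, 5) and one weight per new feature
```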
9 votes, 2 answers

Normalization in sci-kit learn linear_models

If the normalization parameter is set to True in any of the linear models in sklearn.linear_model, is normalization applied during the score step? For example: from sklearn import linear_model from sklearn.datasets import load_boston a =…
mgoldwasser • 14,558 • 15 • 79 • 103
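
One way to make the scaling explicit, so that it is applied consistently inside fit, predict and score, is a Pipeline with a scaler; note that StandardScaler is not the same transform as normalize=True, and load_diabetes stands in for the question's load_boston here:

```
from sklearn.datasets import load_diabetes
from sklearn.linear_model import LinearRegression
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

X, y = load_diabetes(return_X_y=True)

# The scaler is (re)applied to X inside fit(), predict() and score(),
# so the scoring step sees the same preprocessing as the fitting step.
model = make_pipeline(StandardScaler(), LinearRegression())
model.fit(X, y)
print(model.score(X, y))
```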
9 votes, 2 answers

Return std and confidence intervals for out-of-sample prediction in StatsModels

I'd like to find the standard deviation and confidence intervals for an out-of-sample prediction from an OLS model. This question is similar to Confidence intervals for model prediction, but with an explicit focus on using out-of-sample data. The…
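
A sketch using statsmodels' get_prediction() on new data, which returns standard errors plus mean and observation intervals; the data here are simulated:

```
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(0)
X = sm.add_constant(rng.normal(size=(50, 1)))
y = X @ np.array([1.0, 2.0]) + rng.normal(scale=0.5, size=50)
results = sm.OLS(y, X).fit()

X_new = sm.add_constant(rng.normal(size=(5, 1)))   # out-of-sample rows
pred = results.get_prediction(X_new)
# Columns include mean, mean_se, mean_ci_* and obs_ci_* (prediction interval).
print(pred.summary_frame(alpha=0.05))
```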
9 votes, 2 answers

Does scikit-learn perform "real" multivariate regression (multiple dependent variables)?

I would like to predict multiple dependent variables using multiple predictors. If I understood correctly, in principle one could make a bunch of linear regression models that each predict one dependent variable, but if the dependent variables are…
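
scikit-learn's LinearRegression accepts a 2-D target array and fits one coefficient vector per dependent variable; for ordinary least squares this yields the same coefficients as fitting each target separately. A toy sketch:

```
import numpy as np
from sklearn.linear_model import LinearRegression

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 4))      # 4 predictors
Y = rng.normal(size=(100, 2))      # 2 dependent variables

model = LinearRegression().fit(X, Y)
print(model.coef_.shape)           # (2, 4): one coefficient row per target
print(model.predict(X[:3]))        # predictions have shape (3, 2)
```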
9 votes, 2 answers

Weights with plm package

My data frame looks like something as follows: unique.groups<- letters[1:5] unique_timez<- 1:20 groups<- rep(unique.groups, each=20) my.times<-rep(unique_timez, 5) play.data<- data.frame(groups, my.times, y= rnorm(100), x=rnorm(100), POP= 1:100) I…
Zslice • 412 • 1 • 5 • 14
9 votes, 1 answer

Numpy linear regression with regularization

I'm not seeing what is wrong with my code for regularized linear regression. Unregularized I have simply this, which I'm reasonably certain is correct: import numpy as np def get_model(features, labels): return…
Marshall Farrier • 947 • 2 • 11 • 20
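
A hedged sketch of closed-form ridge (L2-regularized) least squares in NumPy; the function name get_model echoes the question, and intercept handling is omitted:

```
import numpy as np

def get_model(features, labels, lam=1.0):
    """Closed-form ridge regression: w = (X'X + lam*I)^(-1) X'y."""
    X = np.asarray(features, dtype=float)
    y = np.asarray(labels, dtype=float)
    A = X.T @ X + lam * np.eye(X.shape[1])
    return np.linalg.solve(A, X.T @ y)

# Toy usage
rng = np.random.default_rng(0)
X = rng.normal(size=(30, 3))
y = X @ np.array([1.0, -1.0, 2.0]) + rng.normal(scale=0.1, size=30)
print(get_model(X, y, lam=0.5))
```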
9 votes, 2 answers

D3.js linear regression

I searched for some help on building linear regression and found some examples here: nonlinear regression function and also some js libraries that should cover this, but unfortunately I wasn't able to make them work properly: simple-statistics.js…
tomtomtom • 1,502 • 1 • 18 • 27
9 votes, 1 answer

Efficient 1D linear regression for each element of 3D numpy array

I have 3D stacks of masked arrays. I'd like to perform a linear regression for values at each row,col (spatial index) along axis 0 (time). The dimensions of these stacks varies, but a typical shape might be (50, 2000, 2000). My spatially-limited…
David Shean • 1,015 • 1 • 9 • 11
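
One vectorized approach is to reshape the stack to (time, pixels) and fit every pixel in a single np.polyfit call; masking is ignored in this sketch and the array is smaller than the question's typical shape:

```
import numpy as np

t, rows, cols = 50, 200, 200            # smaller than the question's (50, 2000, 2000)
stack = np.random.rand(t, rows, cols)   # stand-in for the masked stacks
time = np.arange(t, dtype=float)

flat = stack.reshape(t, -1)             # shape (t, rows*cols)
slope, intercept = np.polyfit(time, flat, 1)   # one linear fit per pixel column
slope = slope.reshape(rows, cols)
intercept = intercept.reshape(rows, cols)
print(slope.shape, intercept.shape)     # (200, 200) (200, 200)
```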