Questions tagged [linear-regression]

For issues related to the linear regression modelling approach.

Linear Regression is a formalization of relationships between variables in the form of mathematical equations. It describes how one or more random variables are related to one or more other variables. Here the variables are not deterministically but stochastically related.

Example

Height and age are probabilistically distributed over humans, and they are stochastically related: knowing that a person is 30 years old changes the probability that this person is 4 feet tall, and knowing that a person is 13 years old changes the probability that this person is 6 feet tall.

Model 1

$\text{height}_i = b_0 + b_1\,\text{age}_i + \varepsilon_i$, where $b_0$ is the intercept, $b_1$ is the parameter that age is multiplied by to get a prediction of height, $\varepsilon_i$ is the error term, and $i$ indexes the subject

Model 2

$\text{height}_i = b_0 + b_1\,\text{age}_i + b_2\,\text{sex}_i + \varepsilon_i$, where the variable sex is dichotomous
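As a rough illustration of how Model 1 and Model 2 could be fitted, here is a minimal Python sketch using NumPy's least-squares solver. The data are synthetic and made up purely for illustration, and the helper `fit_ols`, the "true" coefficients, and the units are assumptions introduced here, not part of the tag description.

```python
import numpy as np

# Synthetic data, invented only to illustrate the two models (not real measurements).
rng = np.random.default_rng(0)
n = 200
age = rng.uniform(2, 18, size=n)        # age in years
sex = rng.integers(0, 2, size=n)        # dichotomous variable coded 0 or 1
true_b = np.array([80.0, 5.0, 3.0])     # assumed b0, b1, b2 used to generate the data
noise = rng.normal(0.0, 2.0, size=n)    # error term, one draw per subject
height = true_b[0] + true_b[1] * age + true_b[2] * sex + noise


def fit_ols(y, *predictors):
    """Least-squares estimates [b0, b1, ...] for y = b0 + b1*x1 + ... + error."""
    # A leading column of ones makes the first coefficient the intercept b0.
    X = np.column_stack([np.ones(len(y)), *predictors])
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return coef


b_model1 = fit_ols(height, age)         # Model 1: height on age
b_model2 = fit_ols(height, age, sex)    # Model 2: height on age and sex
print("Model 1 estimates:", b_model1)
print("Model 2 estimates:", b_model2)
```

Coding the dichotomous variable as 0/1 means that, in this sketch, the estimate of b2 is the fitted height difference between the two groups at a given age.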

In linear regression, the response Y is modelled as a linear function of the user data X, and the unknown model parameters W are estimated (learned) from the data. For example, a linear regression model for k-dimensional user data can be written as:

$Y = w_1 x_1 + w_2 x_2 + \dots + w_k x_k$
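One common way to estimate W is ordinary least squares. The sketch below uses notation assumed here for illustration: the n observations are stacked as the rows of a matrix $X$, the observed responses form a vector $y$, and $w = (w_1, \dots, w_k)^\top$.

```latex
\begin{aligned}
y &= X w + \varepsilon, \\
\hat{w} &= \arg\min_{w} \lVert y - X w \rVert_2^2
         = (X^\top X)^{-1} X^\top y
         \quad \text{(when } X^\top X \text{ is invertible)}.
\end{aligned}
```

An intercept can be included in this form by adding a column of ones to $X$.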

Further reading: Statistical Modeling: The Two Cultures, http://projecteuclid.org/download/pdf_1/euclid.ss/1009213726

In R, a scientific software environment for statistical computing and graphics, the function lm implements linear regression.

6517 questions
2
votes
0 answers

R - Fitting a constrained AutoRegression time series

I have a time-series which I need to fit onto an AR (auto-regression) model. The AR model has the form: x(t) = a0 + a1*x(t-1) + a2*x(t-2) + ... + aq*x(t-q) + noise. I have two constraints: Find the best AR fit when lag.max = 50. Sum of all…
Paul
  • 73
  • 6
2
votes
1 answer

MATLAB: Piecewise function in curve fitting toolbox using fittype

Ignore the red fitted curve first. I'd like to get a curve to the blue datapoints. I know the first part (up to y~200 in this case) is linear, then a different curve (combination of two logarithmic curves but could also be approximated differently)…
tim
  • 9,896
  • 20
  • 81
  • 137
2
votes
4 answers

Java Non-negative multiple linear regression library

I am working on a Java project, and I have to compute a multiple linear regression, but I want the resulting parameters to be non-negative. Is there an existing commercial-friendly-licensed library to do such a thing? I've been looking for Non-Negative…
Cedric Buron
  • 95
  • 1
  • 10
2
votes
1 answer

How to choose Gaussian basis functions hyperparameters for linear regression?

I'm quite new to machine learning, and I'm trying to properly understand some basic concepts. My problem is the following: I have a set of data observations and the corresponding target values {x,t}. I'm trying to train a function with…
2
votes
1 answer

Fitting downward trends (negative slope) with statsmodels linear regression

I can't get linear regression in Python StatsModels to fit a data series with a negative slope: neither RLM nor OLS is working for me. Take a very simple case where I'd expect a slope of -1: In [706]: ts12 =…
Reed Sandberg
  • 671
  • 1
  • 10
  • 18
2
votes
2 answers

Selecting columns in a data.frame to implement in a model

Is there a way to run a model (for simplicity, a linear model) using specified columns of a data.frame? For example, I would like to be able to do something like this: set.seed(1) ET = runif(10, 1,20) x1 = runif(10, 1,20) x2 = runif(10, 1,30) x3 =…
Sarah
  • 731
  • 7
  • 15
2
votes
1 answer

Matlab plot regression function

I'm plotting a linear regression using the MATLAB function plotregression in this way: hand = plotregression(x, y, 'Regression') However, I'd like to get rid of the y = T line in the plot, and also use a different marker, such as *. How can I do…
Paula
  • 21
  • 1
  • 4
2
votes
2 answers

Re-transform a linear model. Case study with R

Let's say I have a response variable which is not normally distributed and an explanatory variable. Let's create these two variables first (coded in R): set.seed(12) resp = (rnorm(120)+20)^3.79 expl = rep(c(1,2,3,4),30) I run a linear model and I…
Remi.b
  • 17,389
  • 28
  • 87
  • 168
2
votes
1 answer

How do I apply scikit-learn's LogisticRegression for some decimal data?

I have a training data set like this: 0.00479616 | 0.0119904 | 0.00483092 | 0.0120773 | 1 0.51213136 | 0.0113404 | 0.02383092 | -0.012073 | 0 0.10479096 | -0.011704 | -0.0453692 | 0.0350773 | 0 The first 4 columns are the features of one sample…
WoooHaaaa
  • 19,732
  • 32
  • 90
  • 138
2
votes
1 answer

Multiple linear regression python

I use multiple linear regression. I have one dependent variable (var) and several independent variables (varM1, varM2,...). I use this code in Python: z=array([varM1, varM2, varM3],int32) n=max(shape(var)) X = vstack([np.ones(n), z]).T a =…
user2050187
  • 175
  • 2
  • 5
  • 14
2
votes
2 answers

Applying fixed effects factor in R breaks the regression

I am trying to run a fixed effects regression in R. When I run the linear model without the fixed effects factor being applied the model works just fine. But when I apply the factor - which is a numeric code for user ID, I get the following error:…
vijkrishb
  • 23
  • 2
2
votes
1 answer

Apply regression coefficients that have one answer per factor to many entries per factor in a dataframe in R

I have a dataframe that has a column for time, symbol, price, volatility. I use this dataframe to run a first pass OLS regression using dummy variables for the symbol fit <- lm(volatility~factor(symbol) + 0 Then I want to use the coefficients from…
samooch
  • 81
  • 1
  • 1
  • 5
2
votes
1 answer

Linear model: comparing predictive power of two different measurement methods

I'm interested in predicting Y and am studying two different measurement techniques, X1 and X2. It could be for instance that I want to predict the tastiness of a banana, either by measuring how long it has been lying on the table, or by measuring…
Marijn van Vliet
  • 5,239
  • 2
  • 33
  • 45
2
votes
0 answers

R: saving lm object with least amount of space while still maintaining functionality of the predict function

I have developed a linear regression model using lm. My main purpose is to predict a prediction interval using the function predict. As it stands right now, the lm object is too big for my taste. As a result, I was wondering if there was any way…
2
votes
2 answers

Efficient way to do a rolling linear regression

I have two vectors x and y, and I want to compute a rolling regression for those, e.g a on (x(1:4),y(1:4)), (x(2:5),y(2:5)), ... Is there already a function for that? The best algorithm I have in mind for this is O(n), but applying separate linear…