Questions tagged [linear-regression]

For issues related to the linear regression modelling approach.

Linear regression is a formalization of relationships between variables in the form of mathematical equations. It describes how one or more random variables are related to one or more other variables. The variables are related stochastically rather than deterministically.

Example

Height and age are probabilistically distributed over humans, and they are stochastically related: knowing that a person is 30 years old changes the probability that this person is 4 feet tall, and knowing that a person is 13 years old changes the probability that this person is 6 feet tall.

Model 1

$\text{height}_i = b_0 + b_1\,\text{age}_i + \varepsilon_i$, where $b_0$ is the intercept, $b_1$ is a parameter that age is multiplied by to get a prediction of height, $\varepsilon_i$ is the error term, and $i$ is the subject.
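
As a minimal sketch, the intercept b0 and slope b1 of Model 1 could be estimated by ordinary least squares using the closed-form formulas for a single predictor. The function name `fit_simple_ols` and the age/height values are made up for illustration; they are not real measurements.

```python
def fit_simple_ols(x, y):
    """Return (b0, b1) minimising the sum of squared residuals of y ≈ b0 + b1 * x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Slope: co-variation of x and y divided by the variation of x.
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    b1 = sxy / sxx
    # Intercept: the fitted line passes through the point of means.
    b0 = mean_y - b1 * mean_x
    return b0, b1


# Hypothetical toy data (age in years, height in cm), for illustration only.
ages = [4, 6, 8, 10, 12]
heights = [102, 115, 128, 139, 150]
b0, b1 = fit_simple_ols(ages, heights)
```

The residuals `heights[i] - (b0 + b1 * ages[i])` then play the role of the error term in the model.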

Model 2

$\text{height}_i = b_0 + b_1\,\text{age}_i + b_2\,\text{sex}_i + \varepsilon_i$, where the variable sex is dichotomous.
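
To illustrate what the dichotomous variable does in Model 2, here is a small sketch that assumes sex is coded as 0 or 1 (dummy coding). The function `predict_height` and the coefficient values are hypothetical and chosen only to show the structure of the model, not estimated from any data.

```python
def predict_height(age, sex, b0, b1, b2):
    """Deterministic part of Model 2; sex is assumed to be coded 0 or 1."""
    return b0 + b1 * age + b2 * sex


# Hypothetical coefficients, for illustration only.
b0, b1, b2 = 76.0, 6.0, 3.0

# At any fixed age, the two groups differ by exactly b2.
gap_at_8 = predict_height(8, 1, b0, b1, b2) - predict_height(8, 0, b0, b1, b2)
gap_at_12 = predict_height(12, 1, b0, b1, b2) - predict_height(12, 0, b0, b1, b2)
```

Under this coding the model describes two parallel lines in age: both share the slope b1, and b2 shifts the intercept for the group coded 1.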

In linear regression, the response Y is modelled as a linear function of the user data X, and the unknown model parameters W are estimated or learned from the data. For example, a linear regression model for k-dimensional user data can be represented as:

$Y = w_1 x_1 + w_2 x_2 + \dots + w_k x_k$
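
One way the weights could be learned from data is batch gradient descent on the mean squared error, sketched below for the intercept-free model above. This is only one of several estimation approaches (closed-form least squares is another). The function name `fit_weights`, the learning rate, the iteration count, and the toy data are all assumptions made for illustration.

```python
def fit_weights(X, y, learning_rate=0.1, iterations=1000):
    """Estimate w in y ≈ w1*x1 + ... + wk*xk by batch gradient descent."""
    n, k = len(X), len(X[0])
    w = [0.0] * k
    for _ in range(iterations):
        # Residuals of the current linear prediction for every row.
        errors = [sum(wj * xj for wj, xj in zip(w, row)) - yi
                  for row, yi in zip(X, y)]
        # Gradient of the mean squared error with respect to each weight.
        grad = [2.0 / n * sum(e * row[j] for e, row in zip(errors, X))
                for j in range(k)]
        # Step against the gradient.
        w = [wj - learning_rate * gj for wj, gj in zip(w, grad)]
    return w


# Hypothetical noise-free toy data generated from y = 2*x1 - 1*x2.
X = [[1, 0], [0, 1], [1, 1], [2, 1]]
y = [2, -1, 1, 3]
w = fit_weights(X, y)
```

With these settings the estimates approach the generating weights 2 and -1. On other data the learning rate usually needs tuning, since too large a value makes the iteration diverge.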

Further reading: Statistical Modeling: The Two Cultures, http://projecteuclid.org/download/pdf_1/euclid.ss/1009213726

In R, software for statistical computing and graphics, the function lm implements linear regression.

6517 questions
2
votes
1 answer

Fitting linear model / ANOVA by group

I'm trying to run anova() in R and running into some difficulty. This is what I've done up to now to help shed some light on my question. Here is the str() of my data to this point. str(mhw) 'data.frame': 500 obs. of 5 variables: $ r : int …
pc8807
  • 47
  • 1
  • 6
2
votes
1 answer

lm() on row of data.frame ~ colnames

I try to perform a simple lm() regression analysis on a data frame. Explicitly, I want to perform a regression analyses between the column names of the data frame and each row. My data frame looks like this: d =…
Feliks
  • 154
  • 2
  • 9
2
votes
1 answer

python linear regression implementation

I've been trying to do my own implementation of a simple linear regression algorithm, but I'm having some trouble with the gradient descent. Here's how I coded it: def gradientDescentVector(data, alpha, iterations): a = 0.0 b = 0.0 X =…
msk
  • 131
  • 1
  • 10
2
votes
3 answers

How to best approach the problem of trying to determine the form of an unknown function

I have a set of variables X, Y, ..., Z. My job is to design a function that takes this set of variables and yields an integer. I have a fitness function to test this against. My first stab at the problem is to assume that I can model f to be a…
devoured elysium
  • 101,373
  • 131
  • 340
  • 557
2
votes
1 answer

lmPerm::lmp(y~x*f,center=TRUE) vs lm(y~x*f): very different coefficients

While lmp(y~x, center=TRUE,perm="Prob") lm(y~x) gives a similar result for x and y being quantitative variables, lmp(y~x*f, center=TRUE,perm="Prob") lm(y~x*f) differs where f is a factor variable. require(lmPerm) ## Test data x <-…
user2955884
  • 405
  • 2
  • 11
2
votes
1 answer

SciKit Learn - Mathematical model behind linear regression?

What mathematical model does the Linear Regression function use in scikit learn? The Ordinary Least Squares model has more than one way to minimize the cost function. I've found the form of the function it solves here, but I'm also interested which…
lte__
  • 7,175
  • 25
  • 74
  • 131
2
votes
1 answer

Linear regression with Spark MLlib only returns monotonic predictions

Check the update at the bottom of the question Summary: I have a dataset that does not behave linearly. I am trying to use Spark's MLlib(v1.5.2) to fit a model that behaves more as a polynomial function but I always get a linear model as a result. I…
omrsin
  • 568
  • 10
  • 18
2
votes
1 answer

Trying to plot a simple function - python

I implemented a simple linear regression and I want to try it out by fitting a non linear model specifically I am trying to fit a model for the function y = x^3 + 5 for example this is my code import numpy as np import numpy.matlib import…
Oria Gruber
  • 1,513
  • 2
  • 22
  • 44
2
votes
4 answers

Linear regression using Sklearn prediction not working. data not fit properly

I am trying to perform a linear regression on following data. X = [[ 1 26] [ 2 26] [ 3 26] [ 4 26] [ 5 26] [ 6 26] [ 7 26] [ 8 26] [ 9 26] [10 26] [11 26] [12 26] [13 26] [14 26] [15 26] [16 26] [17 26] [18 26] [19 26] [20 26] …
Jibin Mathew
  • 4,816
  • 4
  • 40
  • 68
2
votes
0 answers

Can some coefficients be held constant during regression training in PySpark?

Is it possible to specify that certain coefficients should be held constant (at a pre-determined value) during the training of a regression model in PySpark? For example, if I have the simple, single-feature data shown below, I can fit a straight…
2
votes
1 answer

Linear regression with tf-idf transformation

I have two dataframes, the former contains > 700 predictors in columns and the latter contains one column. The former is used as predictors (all with values 0 and 1 but mostly 0 because of sparsity) and the second as the response for model training…
yearntolearn
  • 1,064
  • 2
  • 17
  • 36
2
votes
1 answer

Fitting logaritmic curve in a dataset

I have a dataset archivo containing the rates of bonds for every duration of the government auctions since 2003. The first few rows are: Fecha 1 2 3 4 5 6 7 8 9 10 11 12 18 24 2003-01-02 NA NA NA NA NA 44.9999 NA NA 52.0002…
Juan M
  • 119
  • 10
2
votes
1 answer

How to perform multivariable linear regression with scikit-learn?

Forgive my terminology, I'm not an ML pro. I might use the wrong terms below. I'm trying to perform multivariable linear regression. Let's say I'm trying to work out user gender by analysing page views on a web site. For each user whose gender I…
jbrown
  • 7,518
  • 16
  • 69
  • 117
2
votes
2 answers

Java or C equivalent of MATLAB's robustfit

MATLAB has a magnificent robustfit function that solves the problem of excluding outliers with linear regression fitting. Is there anything similar written in Java or C (or in language X that could be adopted)?
Joonas Pulakka
  • 36,252
  • 29
  • 106
  • 169
2
votes
2 answers

Produce nice linear regression plot (fitted line, confidence / prediction bands, etc)

I have this sample 10-year regression in the future. date<-as.Date(c("2015-12-31", "2014-12-31", "2013-12-31", "2012-12-31")) value<-c(16348, 14136, 12733, 10737) #fit linear regression model<-lm(value~date) #build predict…
Oposum
  • 1,155
  • 3
  • 22
  • 38