Questions tagged [linear-regression]

For issues related to the linear regression modelling approach.

Linear regression is a formalization of relationships between variables in the form of mathematical equations. It describes how one or more random variables are related to one or more other variables. The variables are related stochastically, not deterministically.

Example

Height and age are probabilistically distributed over humans, and they are stochastically related. Knowing that a person is 30 years old changes the probability that this person is 4 feet tall; knowing that a person is 13 years old changes the probability that this person is 6 feet tall.

Model 1

$\text{height}_i = b_0 + b_1 \, \text{age}_i + \varepsilon_i$, where $b_0$ is the intercept, $b_1$ is a parameter that age is multiplied by to get a prediction of height, $\varepsilon_i$ is the error term, and $i$ is the subject.
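As a minimal sketch of how Model 1 could be fitted, the snippet below estimates b0 and b1 by ordinary least squares using the closed-form formulas for a single predictor. The function name `fit_simple_ols` is hypothetical, and the age and height values are synthetic numbers generated only for illustration, not real measurements.

```python
import numpy as np

def fit_simple_ols(x, y):
    """Return (b0, b1) minimizing the sum of squared errors of y = b0 + b1*x."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    x_mean, y_mean = x.mean(), y.mean()
    # Slope: sample covariance of x and y divided by sample variance of x.
    b1 = np.sum((x - x_mean) * (y - y_mean)) / np.sum((x - x_mean) ** 2)
    # Intercept: forces the fitted line through the point of means.
    b0 = y_mean - b1 * x_mean
    return b0, b1

# Synthetic illustration data (assumed values, not taken from any study).
rng = np.random.default_rng(0)
age = rng.uniform(5, 18, size=200)
height = 80.0 + 5.0 * age + rng.normal(0.0, 4.0, size=200)

b0, b1 = fit_simple_ols(age, height)
print(f"intercept b0 = {b0:.2f}, slope b1 = {b1:.2f}")
```

With noisy data the estimates should land near, but not exactly on, the values used to generate the data.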

Model 2

$\text{height}_i = b_0 + b_1 \, \text{age}_i + b_2 \, \text{sex}_i + \varepsilon_i$, where the variable sex is dichotomous.
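One common way to include a dichotomous variable such as sex in Model 2 is to code it as 0/1 and add it as a column of the design matrix. The sketch below does that with NumPy's least-squares solver. The 0/1 coding, the helper name `fit_ols`, and the synthetic data are all assumptions made for illustration.

```python
import numpy as np

def fit_ols(X, y):
    """Least-squares coefficients for y ~ X; X must already hold the intercept column."""
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return coef

# Synthetic illustration data (assumed values, not real measurements).
rng = np.random.default_rng(1)
n = 300
age = rng.uniform(5, 18, size=n)
sex = rng.integers(0, 2, size=n)  # dichotomous variable, coded 0 or 1 (assumed coding)
height = 80.0 + 5.0 * age + 3.0 * sex + rng.normal(0.0, 4.0, size=n)

# Design matrix columns: intercept, age, sex.
X = np.column_stack([np.ones(n), age, sex])
b0, b1, b2 = fit_ols(X, height)
print(f"b0 = {b0:.2f}, b1 = {b1:.2f}, b2 = {b2:.2f}")
```

Under this coding, b2 is the estimated difference in expected height between the two groups at the same age.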

In linear regression, the response Y is modelled as a linear function of the user data X, and the unknown model parameters W are estimated or learned from the data. For example, a linear regression model for k-dimensional user data can be represented as:

$Y = w_1 x_1 + w_2 x_2 + \dots + w_k x_k$
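A hedged sketch of how the parameters W of this model might be estimated: the least-squares solution satisfies the normal equations (XᵀX) W = Xᵀ Y, which can be solved directly when XᵀX is invertible. The function name `normal_equation` and the small data set are made up for illustration. The model has no intercept term, matching the equation above.

```python
import numpy as np

def normal_equation(X, y):
    """Solve (X^T X) w = X^T y for w; assumes X^T X is invertible."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    # Solving the linear system avoids forming an explicit matrix inverse.
    return np.linalg.solve(X.T @ X, X.T @ y)

# Made-up, noise-free example with k = 2 features.
X = np.array([[1.0, 2.0],
              [2.0, 0.5],
              [3.0, 1.0],
              [4.0, 3.0]])
true_w = np.array([1.5, -2.0])  # weights used to generate y (illustrative)
y = X @ true_w

w = normal_equation(X, y)
print(w)
```

For ill-conditioned or rank-deficient X, a solver such as `np.linalg.lstsq` is usually the safer choice than the normal equations.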

Reading

Statistical Modeling: The Two Cultures (http://projecteuclid.org/download/pdf_1/euclid.ss/1009213726)

In R, a scientific software environment for statistical computing and graphics, the function lm implements linear regression.

6517 questions
2
votes
3 answers

Rolling regression on irregular time series

Summary (tldr) I need to perform a rolling regression on an irregular time series (i.e. the interval may not even be periodic and go from 0, 1, 2, 3... to ...7, 20, 24, 28...) that's simple numeric and does not necessarily require date/time, but the…
omgCat
  • 83
  • 6
2
votes
2 answers

Looking for a sample how to do weighted linear regression

I'm trying to use MathNet to calculate weighted linear regression of my data. The documentation is here. I'm trying to find a x + b = y such that it would best fit a list of (x,y,w), where w is weight of each point. var r =…
Arsen Zahray
  • 24,367
  • 48
  • 131
  • 224
2
votes
1 answer

simple linear regression by tensorflow

I am a beginner in TensorFlow and machine learning. I want to try a simple linear regression example with TensorFlow, but the loss stops decreasing after 3700 epochs and I don't know what's wrong. Obviously, we got the W = 3.52, b = 2.8865. So y = 3.52*x +…
Lee
  • 383
  • 4
  • 13
2
votes
4 answers

Normal Equation Implementation in Python / Numpy

I've written some beginner code to calculate the co-efficients of a simple linear model using the normal equation. # Modules import numpy as np # Loading data set X, y = np.loadtxt('ex1data3.txt', delimiter=',', unpack=True) data =…
PS94
  • 33
  • 1
  • 1
  • 6
2
votes
1 answer

Significant value change after remove intercept in Linear model

I have implemented a linear regression with intercept and without an intercept: TotalReview ~ Number_of_files + LOC With intercept, I get the below output where Number_of_files variable is significant: Coefficients: Estimate Std.…
2
votes
1 answer

Extract estimated coefficients and the standard error using tidyverse

I have a dataframe like so: set.seed(560) df<-data.frame(lag= rep(1:40, each=228), psit= rep(rnorm(228, 20, 10)),var=rnorm(9120, 50, 10)) For each subset of lag I would like to run a linear regression where psit is predicted by var (e.g.…
Danielle
  • 785
  • 7
  • 15
2
votes
1 answer

Finding linear regression bearing from set of coordinates isn't giving desired result

I have a series of coordinates that I'm wanting to find the linear regression for, so I can then find the bearing of the line. I'm using the Linear Regression algorithm from Swift Algorithm Club on this set of coordinates: 51.48163827836369,…
Andrew
  • 7,693
  • 11
  • 43
  • 81
2
votes
2 answers

rstudent() returns incorrect result for an "mlm" (linear models fitted with multiple LHS)

I know that the support for linear models with multiple LHS is limited. But when it is possible to run a function on an "mlm" object, I would expect the results to be trustworthy. When using rstudent, strange results are produced. Is this a bug or is…
2
votes
1 answer

How to examine the feature weights of a Tensorflow LinearClassifier?

I am trying to understand the Large-scale Linear Models with TensorFlow documentation. The docs motivate these models as follows: Linear model can be interpreted and debugged more easily than neural nets. You can examine the weights assigned to…
TemplateRex
  • 69,038
  • 19
  • 164
  • 304
2
votes
1 answer

R's lm function in Pandas

I have the following lm function in R: in_data <- c(0.5, 0.6, 0.7) minutes <- c(30, 60, 90) foobar <- lm(log(in_data) ~ 0 + hours) Questions I understand the ~ operator is used to separate the left- and right-hand sides in a model formula. So in…
Craig
  • 1,929
  • 5
  • 30
  • 51
2
votes
0 answers

Optimization is doing the opposite in a simple tensorflow example

I have been trying to learn TensorFlow, I'm trying to modify a simple linear regression example and transform it into a polynomial regression by adding 2 more variables. It shouldn't be so hard but somehow when I execute the optimizer instead of…
Diego Orellana
  • 994
  • 1
  • 9
  • 20
2
votes
1 answer

How to define formula dynamically from select input ( multiple = TRUE) in R shiny

I am trying to define a formula for multinomial logistic regression. It should take its input from a drop-down list of up to 6 independent variables (SelectInput, Multiple = TRUE) in R Shiny. I am not able to figure out how to resolve this. Here…
nab
  • 23
  • 5
2
votes
2 answers

Scikit learn: measure of goodness of fit, better splitting the dataset or use all of it?

Sort of taking inspiration from here. My problem So I have a dataset with 3 features and n observations. I also have n responses. Basically I want to see if this model is a good fit or not. From the question above people use R^2 for this purpose.…
Euler_Salter
  • 3,271
  • 8
  • 33
  • 74
2
votes
3 answers

Equations for 2 variable Linear Regression

We are using a programming language that does not have a linear regression function in it. We have already implemented a single variable linear equation: y = Ax + B and have simply calculated the A and B coefficients from the data using a solution…
lkessler
  • 19,819
  • 36
  • 132
  • 203
2
votes
1 answer

caret: Error in table(y) : attempt to make a table with >= 2^31 elements

I am having some trouble with the caret package. I am new to R and I am trying to make a multiple linear regression model. I need to split my data into the testing and training set. I tried to use the caret package createDataPartition, but I get an…
sprc
  • 21
  • 1
  • 3