Questions tagged [linear-regression]

For issues related to the linear regression modelling approach.

Linear Regression is a formalization of relationships between variables in the form of mathematical equations. It describes how one or more random variables are related to one or more other variables. Here the variables are not deterministically but stochastically related.

Example

Height and age are probabilistically distributed over humans. They are stochastically related: knowing that a person is 30 years old changes the probability that this person is 4 feet tall, and knowing that a person is 13 years old changes the probability that this person is 6 feet tall.

Model 1

$\text{height}_i = b_0 + b_1\,\text{age}_i + \varepsilon_i$, where $b_0$ is the intercept, $b_1$ is the parameter that age is multiplied by to get a prediction of height, $\varepsilon_i$ is the error term, and $i$ indexes the subject
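As a minimal sketch of how b0 and b1 in Model 1 might be estimated, the closed-form ordinary least squares formulas for a single predictor can be written in plain Python. The function name `fit_simple_ols` and the age/height values are made up for illustration and are not taken from any real dataset.

```python
def fit_simple_ols(x, y):
    """Return (b0, b1) minimising the sum of squared errors of y = b0 + b1*x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Slope: sum of cross-deviations divided by sum of squared deviations of x.
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    b1 = sxy / sxx
    # Intercept: the fitted line passes through the point of means.
    b0 = mean_y - b1 * mean_x
    return b0, b1


# Hypothetical data: ages in years, heights in centimetres.
ages = [2, 4, 6, 8, 10]
heights = [86, 102, 115, 128, 139]

b0, b1 = fit_simple_ols(ages, heights)
# Estimated error terms: observed height minus predicted height.
residuals = [h - (b0 + b1 * a) for a, h in zip(ages, heights)]
print(b0, b1)
```

With an intercept in the model, the residuals of a least squares fit sum to zero (up to floating-point error), which is a quick sanity check on the result.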

Model 2

$\text{height}_i = b_0 + b_1\,\text{age}_i + b_2\,\text{sex}_i + \varepsilon_i$, where the variable sex is dichotomous
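One way to read Model 2: if the dichotomous variable is coded as 0 or 1, then b2 is the difference in predicted height between the two groups at the same age. The sketch below assumes that 0/1 coding; the function name `predict_height` and the coefficient values are hypothetical, not estimates from data.

```python
def predict_height(age, sex, b0, b1, b2):
    """Model 2 prediction without the error term; sex is coded 0 or 1."""
    if sex not in (0, 1):
        raise ValueError("sex must be coded as 0 or 1")
    return b0 + b1 * age + b2 * sex


# Hypothetical coefficients, chosen only for illustration.
B0, B1, B2 = 75.0, 6.5, 4.0

height_0 = predict_height(10, 0, B0, B1, B2)
height_1 = predict_height(10, 1, B0, B1, B2)
# At any fixed age, the two predictions differ by exactly b2.
gap = height_1 - height_0
print(height_0, height_1, gap)
```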

In linear regression, the response Y is modelled as a linear function of the user data X, and the unknown model parameters W are estimated or learned from the data. For example, a linear regression model for k-dimensional user data can be represented as:

$Y = w_1 x_1 + w_2 x_2 + \dots + w_k x_k$
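A rough sketch of how the parameters W might be learned for this k-dimensional model: least squares via the normal equations (XᵀX)w = Xᵀy, solved here with a small Gaussian elimination routine so the example needs no external libraries. The helper names (`predict`, `solve`, `fit_least_squares`) and the data are assumptions made for illustration. The data are generated without noise from known weights so the fit can be checked.

```python
def predict(w, x):
    """Y = w1*x1 + w2*x2 + ... + wk*xk for one observation x."""
    return sum(wj * xj for wj, xj in zip(w, x))


def solve(a, b):
    """Solve the square system a*z = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [bi] for row, bi in zip(a, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    z = [0.0] * n
    for i in reversed(range(n)):  # back substitution
        s = sum(m[i][j] * z[j] for j in range(i + 1, n))
        z[i] = (m[i][n] - s) / m[i][i]
    return z


def fit_least_squares(rows, y):
    """Estimate w from the normal equations (X^T X) w = X^T y."""
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * yi for r, yi in zip(rows, y)) for i in range(k)]
    return solve(xtx, xty)


# Hypothetical, noise-free data generated from known weights.
true_w = [2.0, -1.0, 0.5]
rows = [[1, 0, 2], [0, 1, 1], [1, 1, 0], [2, 1, 3], [3, 2, 1]]
y = [predict(true_w, r) for r in rows]

w = fit_least_squares(rows, y)
print(w)
```

The model as written has no intercept; one can be included by adding a constant feature equal to 1 to every observation.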

Reading: Statistical Modeling: The Two Cultures, http://projecteuclid.org/download/pdf_1/euclid.ss/1009213726

In R, scientific software for statistical computing and graphics, the function lm implements linear regression.

6517 questions
2
votes
0 answers

How do I determine the weight to assign to each bucket?

Someone will answer a series of questions and will mark each important (I), very important (V), or extremely important (E). I'll then match their answers with answers given by everyone else, compute the percent of the answers in each bucket that are…
2
votes
1 answer

writing a wrapper for a linear modeling function [MASS::lm.gls()]

The function MASS::lm.gls fits a linear model using generalized least squares, and returns an object of class "lm.gls", but it has no print, summary or other methods. I could define these simply by hijacking the methods for "lm" objects print.lm.gls…
user101089
2
votes
2 answers

P values from fastbw regression function of rms package

I am trying the fastbw function of the rms package for backward regression as follows (using the mtcars dataset): > mod = ols(mpg~am+vs+cyl+drat+wt+gear, mtcars) > mod Linear Regression Model ols(formula = mpg ~ am + vs + cyl + drat + wt + gear, data =…
rnso
2
votes
2 answers

How do you know if a data set is right for linear regression if it has multiple features?

If it has one feature it's easy. Just graph it. One of the records there looks like (18, 15). Simple. But if we have multiple features that adds more dimensions to the graph, right? So how can you visualize your data set and determine whether or…
user2102611
2
votes
2 answers

Spark - create RDD of (label, features) pairs from CSV file

I have a CSV file and want to perform a simple LinearRegressionWithSGD on the data. A sample of the data is as follows (the file has 99 rows in total, including labels) and the objective is to predict the y_3…
Mohammad
2
votes
2 answers

need finite 'xlim' values using reactive function in Shiny

I'm trying to build a Linear regression Shiny app with a custom file input. I have a problem with the reactive function in Server.R. The reactive function data returns a data frame called qvdata. When data() is called in renderPlot and I plot from…
Sölvi
2
votes
1 answer

Performing math on a Python Pandas Group By DataFrame

I have a Pandas DataFrame with the following structure: In [1]: df Out[1]: location_code month amount 0 1 1 10 1 1 2 11 2 1 3 12 3 1 4 13 4 …
invoker
2
votes
2 answers

R Prediction on a Linear Regression Model

I'm sure this is something that can be done, just not sure how! I have a dataset of around 500 rows (CSV) that shows footballers' match stats (e.g. passes, shots on target, etc.). I have some of their salaries (around 10) and I'm trying to predict…
2
votes
1 answer

Adding Interaction Terms to MATLAB Multiple Regression

I am currently running a multiple linear regression using MATLAB's LinearModel.fit function, and I am a bit confused about how to properly add interaction terms to the model by hand. As far as I am aware, LinearModel.fit does not standardize…
dwm8
2
votes
1 answer

How to make a for loop to find interactions between several variables in R?

I have a data set with 17 variables. The data is available at this link: http://www.uwyo.edu/crawford/stat3050/final%20project/maxwellchandler.txt I want to find significant interactions between the variables. For example …
Maxwell Chandler
2
votes
1 answer

Calculate 'R Square' and 'P-Value' for multiple linear regression in TSQL

SQL Server has only a few built-in functions for sophisticated statistical analysis, but I need to calculate a multiple linear regression in TSQL. Based on this post (Multiple Linear Regression function in SQL Server), I was able to get…
sqluser
2
votes
2 answers

create x with first order autoregressive process in an OLS

I have a simple regression: $y_i = \beta_1 + \beta_2 x_i + e_i$, with $n = 27$, and "x" an AR(1): $x_i = c + \phi x_{i-1} + \eta_i$, where $\eta_i \sim N(0, 1)$, $x_0 \sim N\left(c/(1-\phi),\ 1/(1-\phi^2)\right)$, $c = 2$, $\phi = 0.6$. I need to create "x"; for this I have set everything including the "x0", however I am…
2
votes
3 answers

linear regression with Quarter dummy

I am trying to fit a linear regression to the data below Power<-mutate(Power,Year=format(Date,"%Y"),Quarter=quarters(Date),Month=format(Date,"%m")) head(Power) Date YY XX Year Quarter 2007-01-01 NA NA 2007 Q1 2007-01-02…
jkl
2
votes
0 answers

why does backwards selection in regsubsets (R, leaps package) yield nonsensical results after rearranging variables in data frame?

I am attempting to do forward and backward selection using the Boston data from the MASS package with the regsubsets() function in the leaps package in R, and to compare the models selected at each size. I observe that I get different results in…
Rebecca
2
votes
2 answers

Cost function for linear regression with multiple variables in Matlab

The multivariate linear regression cost function: Is the following code in Matlab correct? function J = computeCostMulti(X, y, theta) m = length(y); J = 0; J=(1/(2*m)*(X*theta-y)'*(X*theta-y); end
Tamir