I'm having trouble calculating y_pred in a least-squares regression. The idea is something like:

mydata <- read.csv("G:\\sample.csv",header=T)
x<-rep(mydata$wavelength,each=119)
y<-c(mydata$v1,....mydata$v119)
lm(y~x)

Sample data can be downloaded at: https://drive.google.com/file/d/0B86_a8ltyoL3Y3BhU2xFVVo5dnM/view?usp=sharing

In the file, the variable "Wavelength" is x. For each x there are multiple y values measured at different times, given by the variables V1 to V119.

I'm not sure how to set up a regression with multiple y values for each single x. Can someone help me calculate y_pred in this case?

Big thanks!

Vicki1227
  • Either use `predict.lm` (but read the help page carefully) or grab the slope and intercept values from your result and hand-calculate new x,y pairs. – Carl Witthoft Oct 08 '14 at 15:30
  • Start by reading the help page for `lm` with the command `help('lm')`. –  Oct 08 '14 at 16:01
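
As a minimal sketch of the two routes suggested in the comments above, the following fits a simple regression and computes predictions both with `predict` and by hand from the coefficients. The data frame `toy` and its values are invented for illustration and are not taken from the sample file.

```r
# Toy data, invented for illustration; not the contents of sample.csv
toy <- data.frame(x = c(1, 2, 3, 4, 5),
                  y = c(2.1, 3.9, 6.2, 7.8, 10.1))

fit <- lm(y ~ x, data = toy)

# Route 1: predict() dispatches to predict.lm for an lm object.
# newdata must be a data frame whose column name matches the formula (x).
new_x <- data.frame(x = c(6, 7))
pred_predict <- predict(fit, newdata = new_x)

# Route 2: compute intercept + slope * x by hand from the coefficients
b <- coef(fit)
pred_manual <- b[["(Intercept)"]] + b[["x"]] * new_x$x

# The two routes should agree up to floating-point tolerance
all.equal(unname(pred_predict), pred_manual)
```

Calling `predict(fit)` without `newdata` returns the fitted values for the rows used in the fit, which is usually what "y_pred" refers to.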

1 Answer

I think what you want to do is just

mydata <- read.csv("G:\\sample.csv",header=T)
lm(Wavelength ~ ., data = mydata)

This regresses Wavelength on all of the other columns in your data frame, with Wavelength as the response and every other column as a predictor.

In your call to

x<-rep(mydata$wavelength,each=119)

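x ends up NULL, because column names in R are case-sensitive and the data frame has no column called wavelength. You need to capitalize it: Wavelength.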
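
To illustrate the case-sensitivity point, and one possible way to stack the measurement columns for the y ~ x fit described in the question, here is a rough sketch. It uses a small made-up stand-in for the real file (three wavelengths and two measurement columns instead of V1 to V119), and the names `ycols` and `long` are hypothetical helpers introduced only for this example.

```r
# Small stand-in for the real file: 3 wavelengths, 2 measurement columns
# (the real data has V1 to V119); values are invented for illustration
mydata <- data.frame(Wavelength = c(400, 500, 600),
                     V1 = c(1.0, 2.0, 3.0),
                     V2 = c(1.2, 2.1, 3.3))

# Column names are case-sensitive: the lower-case name matches nothing
is.null(mydata$wavelength)

# Every column except Wavelength is treated as a y measurement
ycols <- setdiff(names(mydata), "Wavelength")

# unlist() stacks the y columns one after another (all of V1, then V2),
# so x must repeat as a whole block (times = ...), not element by
# element (each = ...), or the x and y values would be misaligned
long <- data.frame(
  x = rep(mydata$Wavelength, times = length(ycols)),
  y = unlist(mydata[ycols], use.names = FALSE)
)

fit <- lm(y ~ x, data = long)
y_pred <- predict(fit)   # one fitted value per stacked (x, y) pair
```

With the real file, the same pattern should apply after `read.csv`, assuming the columns are named Wavelength and V1 to V119 as described in the question.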

Sean Hughes