So I was posed this question:

(a) The training set contains 1000 observations on 7 covariates, with the last (8th) column containing a continuous response variable. Predict the response variable from the covariates.

(b) The test set contains a further 500 observations on the 7 covariates. Provide predictions of the response using the model you chose in part (a).

I'm not sure if I'm doing this correctly. I've read in the .csv files and fitted a linear regression. Here's what I've been trying:

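    train.lm <- lm(y ~ ., data = train)   # regress y on all 7 covariates
    summary(train.lm)                     # coefficients, residual SE, R-squared
    predict(train.lm, train)              # in-sample fitted values, part (a)
    predict(train.lm, newdata = test)     # predictions for the test set, part (b)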

Am I even on the right track?
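
One way I could check this myself is to hold back part of the training set and see how well the model predicts rows it has not seen. Below is a minimal sketch of that idea. The data are simulated stand-ins, not my actual files: the column names `x1` to `x7`, the coefficients, and the 80/20 split are all assumptions made up for illustration.

```r
# Simulated stand-in for the real data (assumption: 7 numeric covariates
# named x1..x7 and a continuous response y; the true data will differ).
set.seed(1)
n <- 1000
X <- as.data.frame(matrix(rnorm(n * 7), ncol = 7))
names(X) <- paste0("x", 1:7)
train <- cbind(X, y = 2 + 1.5 * X$x1 - 0.5 * X$x2 + rnorm(n))

# Simulated test set: 500 rows of covariates only, as in part (b).
test <- as.data.frame(matrix(rnorm(500 * 7), ncol = 7))
names(test) <- paste0("x", 1:7)

# Hold out 20% of the training rows to estimate out-of-sample error.
idx <- sample(n, size = 0.8 * n)
fit_part <- train[idx, ]
holdout <- train[-idx, ]

fit <- lm(y ~ ., data = fit_part)
pred <- predict(fit, newdata = holdout)

# Compare against a baseline that always predicts the mean of y.
rmse <- sqrt(mean((holdout$y - pred)^2))
rmse_null <- sqrt(mean((holdout$y - mean(fit_part$y))^2))

# Refit on all training rows, then predict the test set.
final <- lm(y ~ ., data = train)
test_pred <- predict(final, newdata = test)
```

If `rmse` is clearly below `rmse_null`, the linear model is at least capturing something. The same hold-out split could be reused to compare `lm` against other candidate models before settling on one for part (b).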

Any help is greatly appreciated.

EDIT: small sample of the data: Data Sample

Karolis Koncevičius
Electrino
  • Looks OK. Is something wrong? Are you getting an error or a bogus answer? – G5W Jun 13 '18 at 19:25
  • Is this some sort of assignment? How do we know what the "correct" way is? We have no idea what your data looks like or what types of models are appropriate for your data. If you are only worried about "correctness", then you should ask the person who will be evaluating you. – MrFlick Jun 13 '18 at 19:28
  • Maybe I'm overthinking it. The question was posed on an entrance exam for a course. The other questions were quite tough, and I thought this one was (out of character) a bit too easy. It's been a while since I did anything in R, and I just wasn't sure if I was even going about it correctly – Electrino Jun 13 '18 at 19:38
  • For anyone to reproduce & check this, they'd need a sample of the data posted as text, not an image of a spreadsheet – camille Jun 15 '18 at 19:08
