I am trying to use the validation set approach on the iris data set in R to estimate the test MSE for linear, quadratic, and cubic regression fits.

# install.packages("class")
# install.packages("boot")
library("class")
library ("boot")
iris <- iris
train = sample(150, 75)
lm.fit = lm(Species ~ Sepal.Length + Sepal.Width + Petal.Length + Petal.Width, data=iris, subset=train)
mean((iris -predict(lm.fit, iris))[-train ]^2)
# The estimated test MSE for the linear regression fit is...
lm.fit2 = lm(iris ~ poly(Sepal.Length + Sepal.Width + Petal.Length + Petal.Width, 2), data=iris, subset=train)
mean((Species - predict(lm.fit2, df))[- train]^2)
# The estimated test MSE for the quadratic regression fit is...
lm.fit3 = lm(iris ~ poly(Sepal.Length + Sepal.Width + Petal.Length + Petal.Width, 3), data=iris, subset=train)
mean((Species - predict(lm.fit3, df))[- train]^2)
# The estimated test MSE for the cubic regression fit is...

These are the messages I keep getting. How do I correct my code so that they no longer appear? I also tried converting the data to a vector, to no avail; it seems it needs to stay a data frame for `lm`?

Warning messages:
1: In model.response(mf, "numeric") :
using type = "numeric" with a factor response will be ignored
2: In Ops.factor(y, z$residuals) : ‘-’ not meaningful for factors
Samh200
  • These are warnings, not errors – desertnaut Oct 14 '21 at 21:46
  • You are trying to do regression with a factor as the response variable, which makes no sense. Predicting `Species` in the `iris` dataset is a classification problem, so you should use a classification algorithm such as multinomial logistic regression, or a machine learning algorithm such as random forest, support vector machine, and others. From the help page of `lm`: a typical model has the form `response ~ terms`, where `response` is the (numeric) response vector. And as @desertnaut said, these are warnings, so code execution does not stop and you still obtain a result (just not a meaningful one) – Elia Oct 14 '21 at 22:10
  • So one way around this is to recode Species as numeric, for example as 0 and 1, correct? That is, treat it as binomial and predict only one species at a time, rather than trying all three at once. – Samh200 Oct 15 '21 at 22:05
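
To make the points in the comments concrete, here is a minimal sketch of the validation set approach done two ways. It is only an illustration under stated assumptions, not the one correct fix: `Petal.Width` as the numeric response and `Petal.Length` as the predictor are arbitrary choices (the question does not say which numeric variable to model), `train_idx`, `validation_mse`, and `k = 5` are made-up names and values, and k-nearest neighbours from the already loaded `class` package stands in for "some classification algorithm".

```r
library(class)                           # provides knn()

set.seed(1)                              # make the random split reproducible
train_idx <- sample(nrow(iris), 75)      # row indices of the training half

# Part 1: regression with a numeric response (assumed: Petal.Width).
# Fit a polynomial of the given degree on the training rows and return
# the mean squared error on the held-out rows.
validation_mse <- function(degree) {
  fit <- lm(Petal.Width ~ poly(Petal.Length, degree),
            data = iris, subset = train_idx)
  pred <- predict(fit, newdata = iris[-train_idx, ])
  mean((iris$Petal.Width[-train_idx] - pred)^2)
}

mse <- sapply(1:3, validation_mse)
names(mse) <- c("linear", "quadratic", "cubic")
print(mse)

# Part 2: classification of Species, kept as a factor.
# The four measurement columns are the features; k = 5 is an arbitrary choice.
X <- iris[, 1:4]
pred_species <- knn(train = X[train_idx, ],
                    test  = X[-train_idx, ],
                    cl    = iris$Species[train_idx],
                    k     = 5)

# For a factor response, the validation error rate plays the role of the MSE.
error_rate <- mean(pred_species != iris$Species[-train_idx])
print(error_rate)
```

Both results depend on the random split, so a different seed gives different numbers. The key difference from the code in the question is that `lm` only ever sees a numeric response, while `Species` is handled by a classifier and scored with an error rate instead of a squared difference.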
