I have the following dataset:

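install.packages("gclus")  # one-time install; gclus provides the wine data set
install.packages("car")    # installed here, but not used in the steps below
library(MASS)              # provides lda()
library(gclus)             # must be loaded before data(wine) can find the data
data(wine)
View(wine)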

I want to split it 70:30 into a training set and a test set, and then carry out LDA on the following subsets of the data:

wine[c("Class", "Malic", "Hue", "Magnesium")] 
wine[c("Class","Hue", "Alcalinity", "Phenols", "Malic", "Magnesium", "Intensity", "Nonflavanoid","Flavanoids")]

Finally, I want to use the function predict to predict the class memberships for the test data and compare the predictions with the true class memberships.

I am getting some errors while doing it, so any help would be appreciated.

nils

1 Answer

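First, split the data into a training set and a test set. Because each row is assigned at random with probabilities 0.7 and 0.3, the resulting split is approximately (not exactly) 70:30: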

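library(MASS)   # provides lda()
library(gclus)  # provides the wine data set
data(wine)

set.seed(123)   # make the random split reproducible
# Label each row 1 (training) or 2 (testing) with probabilities 0.7 and 0.3
ind <- sample(2, nrow(wine), replace = TRUE, prob = c(0.7, 0.3))
training <- wine[ind == 1, ]
testing <- wine[ind == 2, ]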
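
If the split has to be exactly 70:30 rather than approximately so, one option is to draw a fixed number of row indices without replacement. The following is only a sketch under the assumption that gclus is installed; the names train_idx, training_exact and testing_exact are made up for illustration and are not used elsewhere in this answer.

```r
library(gclus)  # assumed to be installed; provides the wine data set
data(wine)

set.seed(123)
n <- nrow(wine)

# Draw exactly floor(0.7 * n) distinct row indices for the training set
train_idx <- sample(n, size = floor(0.7 * n))

training_exact <- wine[train_idx, ]   # 70% of the rows (rounded down)
testing_exact  <- wine[-train_idx, ]  # all remaining rows

# Inspect the resulting sizes
c(train = nrow(training_exact), test = nrow(testing_exact))
```

Unlike the probabilistic labelling above, this fixes the training set size in advance, at the cost of producing a different split for the same seed.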

Next, use the function lda from MASS to fit a linear discriminant analysis for each subset of predictors:

model1 <- lda(Class ~ Malic + Hue + Magnesium, data = training)
model2 <- lda(Class ~ Hue + Alcalinity + Phenols + Malic + Magnesium +
                Intensity + Nonflavanoid + Flavanoids, data = training)
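
To see what lda returns before predicting with it, the fitted object can be inspected directly. This is a small illustrative sketch, assuming gclus is installed; it fits on the full wine data purely to show the structure, and the name fit is hypothetical.

```r
library(MASS)   # provides lda()
library(gclus)  # assumed to be installed; provides the wine data set
data(wine)

# Fit on all rows here only to illustrate the structure of the result
fit <- lda(Class ~ Malic + Hue + Magnesium, data = wine)

fit$prior    # class proportions, used as prior probabilities
fit$means    # per-class mean of each predictor
fit$scaling  # coefficients of the linear discriminants, one column per discriminant
```

With three classes and three predictors there are at most two linear discriminants, so fit$scaling has two columns.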

Finally, predict on the test set and check the results with a confusion matrix:

p1 <- predict(model1, testing)$class
tab <- table(Predicted = p1, Actual = testing$Class)
tab

Output:

         Actual
Predicted  1  2  3
        1 13  3  0
        2  5 14  0
        3  0  2 11

The accuracy is:

cat("Accuracy is:", sum(diag(tab))/sum(tab))

Accuracy is: 0.7916667
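
The steps above only evaluate model1. The same check can be applied to model2 by wrapping the prediction and comparison in a small helper. This is a sketch, not part of MASS: lda_accuracy is a hypothetical helper name, and the block repeats the split and model fits so that it runs on its own (assuming gclus is installed).

```r
library(MASS)   # provides lda()
library(gclus)  # assumed to be installed; provides the wine data set
data(wine)

# Same approximate 70:30 split as above
set.seed(123)
ind <- sample(2, nrow(wine), replace = TRUE, prob = c(0.7, 0.3))
training <- wine[ind == 1, ]
testing  <- wine[ind == 2, ]

model1 <- lda(Class ~ Malic + Hue + Magnesium, data = training)
model2 <- lda(Class ~ Hue + Alcalinity + Phenols + Malic + Magnesium +
                Intensity + Nonflavanoid + Flavanoids, data = training)

# Hypothetical helper: share of test rows whose predicted class equals the true class
lda_accuracy <- function(model, newdata, truth) {
  pred <- predict(model, newdata)$class
  # Compare as character so a factor prediction and a numeric label line up
  mean(as.character(pred) == as.character(truth))
}

acc1 <- lda_accuracy(model1, testing, testing$Class)
acc2 <- lda_accuracy(model2, testing, testing$Class)
cat("Accuracy model1:", acc1, "\n")
cat("Accuracy model2:", acc2, "\n")
```

Comparing the labels directly gives the same quantity as sum(diag(tab))/sum(tab) whenever the confusion matrix is square, and it still works if some class happens to be absent from the test set.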
Quinten