
I'm building an SVM prediction model in R. The dataset isn't supposed to lend itself to models with great accuracy/beta, so I expected to end up with a poorly performing model and then spend time optimizing it. Instead, it predicts with 100% accuracy and a Kappa of 1. I split the data in half for training/testing and then run ksvm on it:

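library(kernlab)  # provides ksvm()
library(caret)    # provides confusionMatrix()

spor <- read.csv("spor.csv")
set.seed(12345)
idx <- sample(nrow(spor), 0.5*nrow(spor))
spor_train <- spor[idx,]
spor_test <- spor[-idx,]
spor_train_lab <- spor_train$alc  # assumed: not shown in the post, defined like spor_test_lab
spor_test_lab <- spor_test$alc
svm <- ksvm(as.factor(spor_train_lab) ~., data = spor_train, kernel = "vanilladot")
svm_pred <- predict(svm, spor_test)
confusionMatrix(svm_pred, as.factor(spor_test_lab))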

I tried changing the train/test ratio and testing on the whole dataset, but whatever I do, the model reports 100% accuracy. I know there has to be a bug in here somewhere, but I have no idea what it could be.
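One thing that may be worth checking, assuming `spor_train_lab` is just `spor_train$alc`: the formula `as.factor(spor_train_lab) ~ .` takes its response from an external vector, so `.` would expand to every column of `spor_train`, including `alc` itself. The model would then be handed the answer as a predictor, which could explain a perfect score on any split. The sketch below uses a synthetic stand-in for the data (the columns `x1` and `x2` are made up) and base R's `terms()` to show which predictors each formula ends up with:

```r
# Synthetic stand-in for spor.csv; x1 and x2 are hypothetical columns,
# `alc` plays the role of the label column from the question
set.seed(12345)
spor_demo <- data.frame(
  alc = sample(c("no", "yes"), 20, replace = TRUE),
  x1  = rnorm(20),
  x2  = rnorm(20)
)

# Assumption: the label vector was built like this outside the posted code
spor_demo_lab <- spor_demo$alc

# Formula shaped like the one in the question: the response is an external
# vector, so `.` expands to every column of `data`, including `alc`
leaky <- terms(as.factor(spor_demo_lab) ~ ., data = spor_demo)
leaky_predictors <- attr(leaky, "term.labels")
print(leaky_predictors)

# Naming the label column itself as the response keeps it out of `.`
spor_demo$alc <- as.factor(spor_demo$alc)
clean <- terms(alc ~ ., data = spor_demo)
clean_predictors <- attr(clean, "term.labels")
print(clean_predictors)
```

If this is what is happening, converting `spor_train$alc` to a factor and fitting with `ksvm(alc ~ ., data = spor_train, kernel = "vanilladot")` should keep the label out of the predictors, and the accuracy should drop to something more believable.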

ApeX
Welcome to SO. It's much easier to answer questions that have a [minimal reproducible example](https://stackoverflow.com/help/minimal-reproducible-example). You can run `dput(spor)` and copy the result into your question so we can see your dataset. If your dataset is large, run `dput(head(spor, n))` where n is the minimum number of rows required to reproduce your issue. Thanks. – L Tyrone Apr 27 '23 at 01:21

0 Answers