I am experimenting with the SVM function on the iris data. The objective is to extract the "class" of highest predicted probability for each row from the output matrix attr(pred_prob, "probabilities").
library(e1071)  # provides svm()
data(iris)
x <- subset(iris, select = -Species)  # predictors
y <- iris$Species                     # class labels
model <- svm(x, y, probability = TRUE)
pred_prob <- predict(model, x, decision.values = TRUE, probability = TRUE)
attr(pred_prob, "probabilities")
(The original code came from this previous thread.)
The last line of code will give us an output of the following format:
       setosa  versicolor   virginica
1 0.979989881 0.011347796 0.008662323
2 0.972567961 0.018145783 0.009286256
3 0.978668604 0.011973933 0.009357463
To make it easier to compare these predicted probabilities with their real class "labels" (i.e., setosa, versicolor, virginica), I plan to extract the class of highest predicted probability for each row from the above output matrix. For example, the class of highest probability for the first observation is setosa, with predicted probability of 0.9799, which is returned by:
which(attr(pred_prob, "probabilities")[1,] == max(attr(pred_prob, "probabilities")[1,]), arr.ind = TRUE)
I am now working on extending the above code into a loop in order to output a data column containing the predicted class label for each observation in the data. Below is what I have so far, but I am having a hard time getting it to work:
predicted_class <- attr(pred_prob, "probabilities")
for(row in 1:nrow(predicted_class)) {
output <- print(which(predicted_class[row,] == max(predicted_class[row,]), arr.ind = TRUE))
output
}
But this does not give me what I intended. It seems to return only the predicted class from a random row, while I want a column of predicted classes for all observations. Could anyone enlighten me on this?
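For reference, here is a minimal sketch of one way the intended column could be built without an explicit loop. It assumes the probability matrix has class names as column names, as in the output above. The toy matrix `prob` below is just the three rows shown earlier, typed in by hand so the snippet runs without fitting a model, and the names `prob` and `result` are illustrative only. In the loop version, `output` is reassigned on every pass, so only the last row's value is left once the loop ends, which may explain what I am seeing.

```r
# Toy probability matrix copied from the three rows shown above;
# in practice this would be attr(pred_prob, "probabilities").
prob <- matrix(
  c(0.979989881, 0.011347796, 0.008662323,
    0.972567961, 0.018145783, 0.009286256,
    0.978668604, 0.011973933, 0.009357463),
  nrow = 3, byrow = TRUE,
  dimnames = list(c("1", "2", "3"),
                  c("setosa", "versicolor", "virginica"))
)

# For each row, find the column index of the largest probability,
# then map that index to the column (class) name.
predicted_class <- colnames(prob)[apply(prob, 1, which.max)]

# Keep the winning probability next to the label, one row per observation.
result <- data.frame(
  predicted_class = predicted_class,
  max_prob = apply(prob, 1, max)
)
print(result)
```

With the real model, replacing `prob` by attr(pred_prob, "probabilities") should give one label per observation, which can then be compared against iris$Species, for example with table().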