
I'm working on a classification problem (predicting three classes) and I'm comparing SVM against Random Forest in R.

For evaluation and comparison I want to calculate the bias and variance of the models. I've looked up both terms in many machine learning books, and I think I understand what bias and variance mean (the bullseye diagram is the easiest explanation). But I can't figure out how to apply them in my case.

Let's say I predict the results for a test set with 4 SVM models that were trained on 4 different training sets. Each time I get a total error (number of wrong predictions / number of all predictions). Do I then get the bias for SVM by calculating this?

[formula image] This would mean that the bias is more or less the mean of the errors?

I hope you can help me with a formula that is not too complicated, because I've already seen many of them.

SecretAgentMan
newbie96
  • Is this a binary classification or multiclass? Bias-variance decomposition is usually used for regression with squared error loss, but there are some options for classification as well. – Szymon Maszke Feb 01 '20 at 13:26
  • It's multiclass. I predict three different states of a machine from aggregated load measurements – newbie96 Feb 01 '20 at 13:50
  • [Here](https://homes.cs.washington.edu/~pedrod/bvd.pdf) is a paper regarding multiclass and binary classification. You would be better off asking this question on [Stats Stack Exchange](https://stats.stackexchange.com/) though. – Szymon Maszke Feb 01 '20 at 13:59
  • Thank you!! I'll have a look at the paper and will also try it on Stats Stack Exchange :) – newbie96 Feb 01 '20 at 14:04
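
For anyone landing here, a minimal sketch of the 0-1 loss definitions from the paper linked in the comments, written in Python rather than R and not tied to any SVM or Random Forest library. It assumes noise-free labels and that every model predicts the same test set. The function name `bias_variance_01` and the toy labels are made up for illustration. Per test point, the "main prediction" is the label most models agree on; bias is whether that main prediction is wrong, and variance is how often the individual models deviate from it.

```python
from collections import Counter


def bias_variance_01(predictions, y_true):
    """Average 0-1 loss bias, variance and error over a shared test set.

    predictions: one list of predicted labels per model (one model per training set)
    y_true:      true labels of the test set (assumed noise-free)
    """
    n_models = len(predictions)
    n_points = len(y_true)
    bias = variance = error = 0.0
    for i in range(n_points):
        votes = [model_preds[i] for model_preds in predictions]
        # Main prediction: most frequent label across models
        # (ties go to the label that was seen first).
        main = Counter(votes).most_common(1)[0][0]
        # Bias: 1 if the main prediction misses the true label, else 0.
        bias += main != y_true[i]
        # Variance: share of models that disagree with the main prediction.
        variance += sum(v != main for v in votes) / n_models
        # Mean error: share of models that miss the true label.
        error += sum(v != y_true[i] for v in votes) / n_models
    return {
        "bias": bias / n_points,
        "variance": variance / n_points,
        "mean_error": error / n_points,
    }


# Toy example: 4 models, 5 test points, 3 classes (hypothetical labels).
y_true = ["a", "b", "c", "a", "b"]
preds = [
    ["a", "b", "c", "b", "b"],
    ["a", "b", "a", "b", "b"],
    ["a", "c", "c", "b", "a"],
    ["a", "b", "c", "a", "b"],
]
result = bias_variance_01(preds, y_true)
print(result)
```

Under these definitions the mean of the per-model error rates is the average loss, which in general differs from the bias: the bias only looks at whether the majority vote across models is wrong.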
