I need to get an SVM classifier off the ground. I don't have a lot of experience with SVMs, so I was wondering, just from a cursory glance at these data sets ( http://archive.ics.uci.edu/ml/datasets.html ), whether there is one in particular I should be using.

Brian Tompsett - 汤莱恩
Walrus the Cat

1 Answer

This related question has a really good answer, with sample datasets and a good explanation of SVMs: Datasets to test Nonlinear SVM

From the list ( http://archive.ics.uci.edu/ml/datasets.html ), I think you should try the Iris dataset for multiclass classification or the Skin Segmentation dataset for binary classification. I think either is a good start: both offer enough data, with continuous-valued features, to test an SVM.
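
To make the binary versus multiclass distinction concrete, here is a minimal sketch in plain Python. It is not code from this answer, and all of it rests on assumptions: the function names (`train_linear_svm`, `train_one_vs_rest`, `predict`, `make_blobs`) are hypothetical, the classifier is a simple linear SVM trained by stochastic gradient descent on the hinge loss, and the data are synthetic 2-D clusters standing in for a small three-class set like Iris, not the real UCI files. A binary problem such as Skin Segmentation would only need `train_linear_svm`; the multiclass case wraps it in a one-vs-rest scheme.

```python
import random


def train_linear_svm(X, y, lam=0.01, eta=0.01, epochs=100, seed=0):
    """Fit a binary linear SVM (labels +1/-1) by SGD on the regularised hinge loss."""
    rng = random.Random(seed)
    w = [0.0] * len(X[0])
    b = 0.0
    order = list(range(len(X)))
    for _ in range(epochs):
        rng.shuffle(order)
        for i in order:
            score = sum(wj * xj for wj, xj in zip(w, X[i])) + b
            # L2 regularisation shrinks the weights on every step.
            w = [(1.0 - eta * lam) * wj for wj in w]
            # Hinge loss: only points inside the margin move the boundary.
            if y[i] * score < 1.0:
                w = [wj + eta * y[i] * xj for wj, xj in zip(w, X[i])]
                b += eta * y[i]
    return w, b


def train_one_vs_rest(X, labels):
    """Train one binary SVM per class: that class (+1) against all others (-1)."""
    models = {}
    for cls in sorted(set(labels)):
        y = [1 if lab == cls else -1 for lab in labels]
        models[cls] = train_linear_svm(X, y)
    return models


def predict(models, x):
    """Return the class whose binary SVM gives the highest score for x."""
    def score(cls):
        w, b = models[cls]
        return sum(wj * xj for wj, xj in zip(w, x)) + b
    return max(models, key=score)


def make_blobs(seed, n_per_class=30):
    """Synthetic stand-in for a small 3-class dataset (NOT the real Iris data)."""
    rng = random.Random(seed)
    centers = {"a": (0.0, 0.0), "b": (6.0, 0.0), "c": (3.0, 6.0)}
    X, labels = [], []
    for cls, (cx, cy) in centers.items():
        for _ in range(n_per_class):
            X.append([rng.gauss(cx, 0.5), rng.gauss(cy, 0.5)])
            labels.append(cls)
    return X, labels


if __name__ == "__main__":
    X_train, y_train = make_blobs(seed=0)
    X_test, y_test = make_blobs(seed=1)
    models = train_one_vs_rest(X_train, y_train)
    hits = sum(predict(models, x) == t for x, t in zip(X_test, y_test))
    print(f"accuracy: {hits / len(y_test):.2f}")
```

This is only meant to show the shape of the problem. For real experiments an established library such as LIBSVM is the more sensible choice, since it handles the multiclass decomposition internally and supports nonlinear kernels.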

Community
marc_ferna
  • Nice, thanks. I'll accept that one, if I don't get (or create) a simple "CLICK THE LINK THESE ONES" answer. For posterity. – Walrus the Cat Aug 29 '12 at 18:46
  • I'm testing one dataset for you also, I'll update this answer soon ;) – marc_ferna Aug 29 '12 at 18:50
  • @WalrustheCat You can also take a look at some old code of mine, an AdaBoost implementation in Java to detect spam. Don't judge... it was a couple of years ago ;) http://code.google.com/p/spamdetection/source/browse/trunk/src/adaBoost.java – marc_ferna Aug 31 '12 at 00:28
  • judge your code? no wai. but your answer? maybe. isn't the iris dataset multivariate, and doesn't SVM lend itself to binary classification tasks? don't you have to do a lot of meta-work to do multivariate predictions in SVM (1 vs 2, 2 vs 3 1 vs winner of 2 vs 3? etc) i chose the 'skin' data set from UCI and i didn't have to do any of that. am i right? why is iris better? muchos gracias – Walrus the Cat Aug 31 '12 at 23:01
  • Iris is a multiclass problem, though it can totally be solved with an SVM. You are right, for binary classification the one you chose is good! – marc_ferna Aug 31 '12 at 23:18