Say you have thousands of images of cats, like these: (https://web.archive.org/web/20150703060412/http://137.189.35.203/WebUI/CatDatabase/catData.html). You wish to build a system that can look at an image and say whether or not it is an image of a cat.

What techniques (if any) can be used to build such a model with decent accuracy?

PS1: The key challenge in this problem is that "what is not a cat" is a huge universe: every image in the world that is not of a cat qualifies. Formulating this as a binary classification problem is not a good fit, since it is nearly impossible to collect a "comprehensive" dataset of "what is not a cat". (If you try, your model will only be as good as your dataset of "what is not a cat".)

PS2: Such a setting is called "one-class classification".

Anuj Gupta
  • If you don't have anything to compare with, the classification result will say the image is similar to a cat with a high percentage, because you have trained the system only to know what a cat is. If you need high accuracy, then you need to train other classes in your system; of course, the "Non classified" result will be whatever doesn't match any of those. – Brank Victoria May 07 '19 at 11:37
  • Also, binary classification problems where the negative class is a lot bigger are not uncommon. Of course, your classifier won't have perfect accuracy, but other than that I don't see the problem. – T A May 07 '19 at 11:40
  • @BrankVictoria: please read https://en.wikipedia.org/wiki/One-class_classification – Anuj Gupta May 07 '19 at 13:29
  • @TA: I have data of only one class! (images of cats) – Anuj Gupta May 07 '19 at 13:30

1 Answer

One approach I can think of is to use a One-Class SVM, which essentially performs outlier detection.
In practice you can first apply a pre-trained CNN to extract a meaningful, compact representation of each image and then use those vectors as input to a One-Class SVM. Anything that is not a cat should then show up as an outlier.
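
Here is a minimal sketch of the second stage with scikit-learn, assuming the CNN embeddings have already been computed. The arrays below are synthetic stand-ins (random vectors), not real image features, and the dimension, `nu=0.05`, and the RBF kernel are illustrative assumptions rather than tuned values.

```python
import numpy as np
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import OneClassSVM

rng = np.random.default_rng(0)

# ASSUMPTION: synthetic stand-ins for CNN embeddings. In a real system each
# row would be the feature vector a pre-trained CNN outputs for one image.
DIM = 8
cat_center = rng.normal(size=DIM)
train_cats = cat_center + 0.3 * rng.normal(size=(1000, DIM))  # training set: cats only
test_cats = cat_center + 0.3 * rng.normal(size=(200, DIM))    # unseen cats
test_other = rng.normal(scale=3.0, size=(200, DIM))           # unrelated images

# Standardize the features, then fit a One-Class SVM on the cat vectors alone.
# nu is an upper bound on the fraction of training points treated as outliers.
model = make_pipeline(
    StandardScaler(),
    OneClassSVM(kernel="rbf", gamma="scale", nu=0.05),
)
model.fit(train_cats)

# predict() returns +1 for inliers ("cat") and -1 for outliers ("not a cat").
pred_cats = model.predict(test_cats)
pred_other = model.predict(test_other)

cat_accept_rate = float(np.mean(pred_cats == 1))
other_reject_rate = float(np.mean(pred_other == -1))
print(f"cats accepted: {cat_accept_rate:.2f}, non-cats rejected: {other_reject_rate:.2f}")
```

On real embeddings the separation will be much less clean than on this toy data, so `nu` and `gamma` would need tuning, ideally against a small held-out set that includes at least a few non-cat images.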

marco romelli