
For example, suppose I want to train a model to classify images as "dog", "cat", or "neither dog nor cat". Do I need to prepare a dataset for "neither dog nor cat", or is there a way to accomplish this with only the "dog" and "cat" datasets?

Asagao

3 Answers


Yes, you should have all three. In theory you can train with just the two classes and then claim "neither" when both logits are low. However, you at least need a "neither" dataset to verify that your model works. Also, training with all three sets will make the model more robust and faster to train.
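The "both logits are low" idea can be sketched as a rejection rule on the raw (pre-softmax) outputs. This is only an illustration: the threshold value here is made up and would have to be tuned on held-out data containing "neither" examples.

```python
import numpy as np

def predict_with_reject(logits, threshold=0.0):
    """Return 'dog', 'cat', or 'neither' from raw two-class logits.

    If both logits fall below `threshold`, the input is rejected as
    'neither'. The threshold of 0.0 is purely illustrative; in practice
    it must be chosen by inspecting validation outputs.
    """
    labels = ["dog", "cat"]
    if max(logits) < threshold:
        return "neither"
    return labels[int(np.argmax(logits))]

print(predict_with_reject(np.array([3.2, -1.0])))   # confident "dog"
print(predict_with_reject(np.array([-2.0, -1.5])))  # both logits low -> "neither"
```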

Poe Dator
  • Thanks for the prompt advice. How do I know the "logits are low" in multiclass classification, given that the softmax activation normalizes the outputs to sum to 1? – Asagao Feb 07 '21 at 17:27

Yes, it is recommended that the labelled data include an "other" class and that an additional output neuron be added to infer it.

Let's start from a binary classifier for "dog" or "cat":

  1. A softmax activation is usually used in the output layer
  2. It normalizes the result into a probability distribution over the two classes
  3. This makes it easy to select the predicted class

Now let us add a third neuron for "other"; we need some data to correctly activate it.
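With three output neurons, the decision rule itself stays simple: take the class with the highest softmax probability. A minimal sketch (the logit values are hypothetical, standing in for a trained network's outputs):

```python
import numpy as np

def softmax(z):
    # Subtract the max for numerical stability before exponentiating.
    z = z - np.max(z)
    e = np.exp(z)
    return e / e.sum()

labels = ["dog", "cat", "other"]
logits = np.array([1.0, 0.2, 2.5])  # hypothetical outputs of the 3-neuron layer
probs = softmax(logits)

print(labels[int(np.argmax(probs))])  # highest-probability class: "other"
```

Because "other" is an ordinary class, no custom rejection logic is needed outside the model.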

Alternatively,

  1. Use a sigmoid activation with two output neurons
  2. Adjust the prediction thresholds so that if both the dog and cat outputs are below their thresholds, the model emits "neither"

Though this alternative approach works, it may not be recommended, because custom logic outside the model is used to infer the additional class (which the model itself knows nothing about).
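The sigmoid alternative can be sketched as follows. Each output neuron gets its own independent sigmoid, and the rejection happens in code outside the model; the 0.5 threshold is just an example and would need tuning.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def predict(logits, threshold=0.5):
    # Each neuron's sigmoid is independent; the two probabilities
    # need not sum to 1.
    p_dog, p_cat = sigmoid(np.asarray(logits, dtype=float))
    if p_dog < threshold and p_cat < threshold:
        return "neither"  # custom logic outside the model
    return "dog" if p_dog >= p_cat else "cat"

print(predict([-2.0, -3.0]))  # both probabilities below 0.5 -> "neither"
print(predict([2.0, -1.0]))   # dog probability above 0.5 -> "dog"
```

Note how the "neither" decision lives entirely in the wrapper function, which is exactly the maintenance concern raised above.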

In the future, if someone adds, let's say, a horse class (along with dog and cat), the code needs to be modified. That seems an unnecessary complexity in the long run.

Thiyanesh
  • I just learned that sigmoid can be used with multiple neurons. Thank you for your answer! – Asagao Feb 07 '21 at 17:54
  • @Asagao, thank you. We can think of a sigmoid in the output layer as an independent variable (it does not care about the states of the other output neurons). It is generally not advised for classification, as it makes the developer responsible for managing the thresholds in the `single-label` (multiclass) case; softmax generally helps in the long run for single-label problems. If you are working on `multi-label` classification, then sigmoid might be a better choice than `softmax`. Disclaimer: I am a beginner in this field. – Thiyanesh Feb 07 '21 at 17:59
  • @Asagao, to be clearer about your statement and to clear up any misunderstanding of `just learned sigmoid can be used with multiple neurons`: a sigmoid works on a single neuron. It is just that we have multiple neurons in the output layer, each with its own sigmoid activation. Each sigmoid-activated output neuron is independent of the others; they don't even know the others exist. Whereas `softmax` has to know about all the neurons to normalize the possible outputs into a probability distribution (0..1). – Thiyanesh Feb 07 '21 at 18:02
  • Thank you very much for the further clarification. I understood that 'it's just that we have multiple neurons in output and each one with a sigmoid activation'. – Asagao Feb 07 '21 at 18:57
  • Great. Have a great day – Thiyanesh Feb 07 '21 at 21:07
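The distinction discussed in the comments, that per-neuron sigmoids are independent while softmax normalizes jointly, can be seen numerically with a few example logits:

```python
import numpy as np

z = np.array([2.0, 0.5, -1.0])  # example logits for three output neurons

# Independent per-neuron sigmoids: each value ignores the others.
sig = 1.0 / (1.0 + np.exp(-z))

# Joint softmax: every output depends on all the logits.
soft = np.exp(z - z.max()) / np.exp(z - z.max()).sum()

print(sig.sum())   # need not equal 1
print(soft.sum())  # always 1 (a probability distribution)
```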

You could try having two output neurons, dog and cat, and when training with dog pictures set the expected output to [1, 0] and for cat to [0, 1], but it seems unlikely that a picture containing neither a cat nor a dog will produce [0, 0].

There is a good chance it could work. I'm currently working with the Fashion-MNIST dataset for a homework assignment: the output has 10 classes and I use ReLU the entire way (meaning each output ranges from 0 to infinity, not 0 to 1). When class 7 is selected, the output layer usually looks like [0 0 0 0 0 0 some-high-value 0 0 0], so if I fed in some arbitrary input, the outputs would most likely all be close to 0. The problem for you is that the outputs will very likely still be non-zero, so you would have to decide on some cutoff below which it is unlikely to be either a dog or a cat.
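That cutoff idea can be sketched directly. The cutoff value below is invented for illustration; in practice you would pick it by inspecting the activations your model produces on validation data.

```python
import numpy as np

def classify_or_reject(activations, cutoff=1.0):
    """activations: final-layer ReLU outputs (all >= 0).

    If no class fires above `cutoff`, return None ("neither").
    Otherwise return the index of the strongest class. The cutoff
    of 1.0 is a placeholder, not a recommended value.
    """
    a = np.asarray(activations, dtype=float)
    if a.max() < cutoff:
        return None  # no class fired strongly enough
    return int(np.argmax(a))

print(classify_or_reject([0.0, 0.03, 0.01]))  # all weak -> None
print(classify_or_reject([0.0, 5.7, 0.1]))    # strong class -> 1
```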

Bruno CL