2

I am looking to train a large model (resnet or vgg) for face identification.

Is it a valid strategy to train on a few faces (1–3) to validate a model?

In other words, if a model learns one face well, is that evidence that the model is good for the task?

The point here is that I don't want to spend a week of expensive GPU time only to find out that my model is no good, the data has errors, or my TF code has a bug.

Boppity Bop
  • 9,613
  • 13
  • 72
  • 151
  • Have you tried looking in model zoos for pretrained models? – Kermit Sep 14 '19 at 16:40
  • I wrote my thoughts more clearly at the end of my answer. Anyway, I will also write them here: good performance on a small dataset doesn't tell you whether the model, when trained on the full dataset, will be a good model. That's why you train on the majority of your dataset and test/validate on a smaller split. – Nikaido Sep 14 '19 at 20:42
  • I think a lot of confusion in the answers and comments are from interpretation of "good model" phrase. What I meant by "good" is "not broken". Usable. – Boppity Bop Sep 17 '19 at 14:01

2 Answers

2

Short answer: No, because deep learning needs a huge amount of data to work well.

Long answer: No. The problem is that training on only one face can overfit your model to that specific face, so it never learns features that aren't present in your examples. For instance, the model may learn to detect that face through a very specific, very simple pattern (that's called overfitting).

As a deliberately silly example: your model may have learned to detect that face only because there is a mole on the right cheek, and it has learned to key on that.

To make your model perform well in the general case, you need a huge amount of data, which forces the model to learn many different kinds of patterns.

Suggestion: Because training a deep neural network is time-consuming, usually one does not train a single network at a time; instead, many networks are trained in parallel with different hyperparameters (layers, nodes, activation functions, learning rate, etc.).

Edit because of the discussion below:

If your dataset is small, it is nearly impossible to get good performance in the general case, because the neural network will learn the easiest pattern, which is usually not the general/best one.

By adding data you force the neural network to extract good patterns that work in the general case.

It's a tradeoff, but training on a small dataset usually does not lead to a good classifier for the general case.

Edit 2: rephrasing everything to make it clearer. Good performance on a small dataset doesn't tell you whether the model, when trained on the full dataset, will be a good model. That's why you train on the majority of your dataset and test/validate on a smaller split.
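That said, the asker's narrower goal (catching bugs before a long run) can be served by a pipeline sanity check: try to overfit a tiny batch and confirm the loss collapses toward zero. A minimal sketch with a toy logistic-regression "model" in NumPy, standing in for the real TF model; all names here are illustrative:

```python
import numpy as np

# Sanity check: if the training loop cannot drive the loss on 3
# samples toward zero, the pipeline (data, labels, loss, updates)
# has a bug. This says nothing about generalisation.
rng = np.random.default_rng(0)
X = rng.normal(size=(3, 8))        # 3 "faces", 8 features each
y = np.array([0.0, 1.0, 1.0])      # identity labels

w = np.zeros(8)
b = 0.0

def loss(w, b):
    p = 1.0 / (1.0 + np.exp(-(X @ w + b)))
    return -np.mean(y * np.log(p + 1e-9) + (1 - y) * np.log(1 - p + 1e-9))

initial = loss(w, b)
for _ in range(500):               # plain gradient descent
    p = 1.0 / (1.0 + np.exp(-(X @ w + b)))
    w -= 0.5 * (X.T @ (p - y) / len(y))
    b -= 0.5 * np.mean(p - y)
final = loss(w, b)
# A healthy pipeline drives the tiny-batch loss close to zero.
```

With a real TF model you would run the same check by calling `fit` on a handful of images for many epochs and watching the training loss, never the validation metrics.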

Nikaido
  • 4,443
  • 5
  • 30
  • 47
  • I would disagree. The number of parameters per neurons and epochs in deep learning make it possible to learn more from less data. – Kermit Sep 14 '19 at 16:38
  • It's also not that time consuming if you are dealing with KB/MB of data. – Kermit Sep 14 '19 at 16:38
  • @hashrocketsyntax And how can you learn patterns that don't exist in your training set? – Nikaido Sep 14 '19 at 16:41
  • You can't. But just because you have a small amount of data doesn't mean the pattern is absent. – Kermit Sep 14 '19 at 16:42
  • If your dataset is small it is quite impossible to get good performance in the general case, because the neural network will learn the easiest pattern, not the general one. – Nikaido Sep 14 '19 at 16:44
  • By adding data you force the neural network to extract good patterns that work in the general case. It's a tradeoff, but training on a small dataset would usually not lead to a good classifier in the general case. – Nikaido Sep 14 '19 at 16:46
  • 1
    okay. if you edit to include that statement i will change to an upvote. i'm prevented from upvote without it being edited. point being, they work well on small data too. – Kermit Sep 14 '19 at 16:50
  • Ok, before you go into a war :) I have enough data. I want to validate my model (make sure it works for the task at hand and doesn't fall into NaN after a week of training). – Boppity Bop Sep 15 '19 at 17:17
  • 1
    @Boppity Bop If the problem is only about code execution/debugging, yes: I think you can train on a little dataset to check that everything is ok. But if you want to validate your model (e.g. find optimal hyperparameters), you need to train on the majority of your training set. – Nikaido Sep 15 '19 at 17:35
  • As someone would point out: "There ain't no such thing as a free lunch" – Nikaido Sep 15 '19 at 17:51
1

For face recognition, a siamese network or triplet loss is usually used. These are approaches to one-shot learning, which means the model can perform well given only a few examples per class (here, per person's face), but you still need to train it on many examples (many different people's faces). See for example:
https://towardsdatascience.com/one-shot-learning-with-siamese-networks-using-keras-17f34e75bb3d
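The core idea of the triplet loss can be sketched in a few lines of NumPy (this is the standard formulation, not necessarily the exact code from the linked article): pull an anchor embedding toward a positive (same person) and push it away from a negative (different person) by at least a margin.

```python
import numpy as np

def triplet_loss(anchor, positive, negative, margin=0.2):
    # Squared Euclidean distances between embeddings.
    d_pos = np.sum((anchor - positive) ** 2)
    d_neg = np.sum((anchor - negative) ** 2)
    # Loss is zero once the negative is at least `margin` farther
    # from the anchor than the positive is.
    return max(d_pos - d_neg + margin, 0.0)

a = np.array([0.0, 0.0])   # anchor embedding
p = np.array([0.1, 0.0])   # positive: close to the anchor
n = np.array([1.0, 1.0])   # negative: far from the anchor
```

With these toy embeddings `triplet_loss(a, p, n)` is already zero, while swapping positive and negative produces a large loss, which is exactly the gradient signal that teaches the network to separate identities.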

You wouldn't train your model from scratch anyway; you would use a pretrained model and fine-tune it for your task.

You could also look at pretrained face recognition models such as facenet for better results:
https://github.com/davidsandberg/facenet
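At inference time, models like facenet are used by comparing embedding distances rather than by classifying. A hedged sketch (the `embed` function below is a hypothetical stand-in for the real network, which maps a face crop to an L2-normalised vector; the threshold value is an assumption to be tuned on your own validation pairs):

```python
import numpy as np

def embed(face_pixels):
    # Stand-in for the pretrained network: the real model produces a
    # 128-d L2-normalised embedding; here we just normalise the
    # flattened pixels so distances behave similarly.
    v = face_pixels.ravel().astype(float)
    return v / np.linalg.norm(v)

def same_person(face_a, face_b, threshold=1.1):
    # Two crops are declared the same identity when their embeddings
    # are closer than a threshold tuned on held-out pairs.
    dist = np.linalg.norm(embed(face_a) - embed(face_b))
    return dist < threshold
```

This is also why a few examples per person suffice at deployment time: all the hard learning happened when the embedding network was trained on many identities.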

cookiemonster
  • 1,315
  • 12
  • 19