
Is it practical to create a dataset by adding noise to just one image? I recently asked here about adding noise to images. I am aware that convolutional neural networks require datasets with thousands of images. However, my goal is to train a model off of just one image.

I intend to create a dataset of about 50 photos just by adding different levels of noise to a single image. Will I be able to get useful results out of it?
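The augmentation idea above can be sketched as follows. This is a minimal illustration, assuming the image is a NumPy array of `uint8` pixel values; the function name, noise range, and Gaussian noise model are illustrative choices, not a prescribed method.

```python
import numpy as np

def make_noisy_dataset(image, n_variants=50, max_sigma=25.0, seed=0):
    """Return n_variants copies of `image` with increasing Gaussian noise."""
    rng = np.random.default_rng(seed)
    variants = []
    for i in range(n_variants):
        # Noise level grows linearly from max_sigma/n_variants up to max_sigma.
        sigma = max_sigma * (i + 1) / n_variants
        noise = rng.normal(0.0, sigma, size=image.shape)
        noisy = np.clip(image.astype(np.float64) + noise, 0, 255)
        variants.append(noisy.astype(np.uint8))
    return variants

# Example with a dummy 64x64 grayscale "image":
dataset = make_noisy_dataset(np.full((64, 64), 128, dtype=np.uint8))
print(len(dataset))  # 50
```

In practice you would load the real photo (e.g. with Pillow or OpenCV) instead of the dummy array, and you could mix in other perturbations (blur, JPEG compression, brightness shifts) for more variety than pure Gaussian noise.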

I hypothesize that this may not be viable for training a CNN from scratch, but I think it could work if I use FaceNet.

For those who may not know what FaceNet is: it is a network trained with triplet loss. It takes an image as input and outputs an embedding. Embeddings can be compared with a distance metric (specifically L2/Euclidean distance), where smaller distances indicate the same face and larger distances indicate different faces. It is evaluated on the LFW dataset to show that it generalizes across facial features.
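The embedding comparison described here can be sketched like this. The 128-dimensional random vectors below are stand-ins for real FaceNet outputs, and the threshold value is an illustrative assumption (the appropriate value depends on the model and dataset).

```python
import numpy as np

def l2_distance(a, b):
    """Euclidean (L2) distance between two embedding vectors."""
    return float(np.linalg.norm(a - b))

def same_face(emb_a, emb_b, threshold=1.1):
    # Smaller distance -> more likely the same identity.
    return l2_distance(emb_a, emb_b) < threshold

rng = np.random.default_rng(0)
anchor = rng.normal(size=128)                          # stand-in embedding
close = anchor + rng.normal(scale=0.01, size=128)      # near-duplicate
far = rng.normal(size=128)                             # unrelated embedding

print(same_face(anchor, close))  # True
print(same_face(anchor, far))    # False
```

With real FaceNet embeddings the pattern is the same: embed the probe face, embed the reference face, and compare the distance to a tuned threshold.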

As I said, I think using my method as a substitute for datasets like LFW is a long shot, but then again I haven't tried it. Might it work for creating different but similar embeddings, though? I am in the process of trying it out, but I want to hear what you think.

Jerome Ariola
  • What's the goal of the net? – Mark Snyder Jan 22 '20 at 23:13
  • If you're referring to what I intend to create: the goal is a dataset of 50+ images generated from a single image. This dataset is used in conjunction with one or more CCTV camera feeds to find a target person. – Jerome Ariola Jan 22 '20 at 23:18
  • Ah, I see. The makers of FaceNet did do some noise testing in the form of testing on images with various levels of JPEG compression, and the net performed well. If you're using transfer learning, I would expect the majority of the noisy images to produce similar embeddings at first. What is your net actually trying to do with its 50-image dataset? Learn to distinguish the noise level? Because if it only trains on images of the same person, it's not going to learn to tell faces apart. – Mark Snyder Jan 22 '20 at 23:37
  • I am training my network (more of an SVM classifier) to identify a single face. I want to draw a bounding box that is green if it detects the target, while all non-target faces are red. If we pretend I were the FBI and I wanted to find, for example, you (Mr. Snyder) in a camera feed, but I only had one photo, I would want to create 50+ different images from it. This is what the net trains on. – Jerome Ariola Jan 23 '20 at 01:31
  • In simpler terms, it is trying to identify just one person in a crowd of people. But you did say it won't tell faces apart; can you clarify? My goal is just to ignore all faces other than the target's. – Jerome Ariola Jan 23 '20 at 01:33
  • Sure. The goal makes sense. But if the only data that the classifier sees is the same face with several different levels of noise, then it doesn't really need to learn anything. It'll just classify everything as the target face. You'll need to give it data that isn't the target. Further, you'll probably need to give it images of faces that aren't the target (as opposed to just showing it environmental items). It might be worth splitting the process into two different steps: one that identifies faces in an image and then your classifier that detects the target face. – Mark Snyder Jan 23 '20 at 01:39
  • I have encountered this issue while using SVMs; I learned the SVM needs a second class to reference. For the sake of practice, I used myself as the target. After hitting that error, I created a dataset with Andrew Ng in it as the non-target class. I'm currently stuck on getting the SVM to work, but I thought I'd ask about the theory of my approach here. I think I can get this to work, but you might be right that I don't need to train on anything else; the embeddings FaceNet outputs can be compared directly to other embeddings, with distances corresponding to similarity. I still wanted to create more photos for increased robustness. – Jerome Ariola Jan 23 '20 at 02:48
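The two-class setup discussed in these comments (target face vs. one non-target identity, classified over embeddings) can be sketched as follows. The embeddings here are synthetic stand-ins for FaceNet outputs, and the cluster centers, noise scale, and kernel choice are illustrative assumptions.

```python
import numpy as np
from sklearn.svm import SVC

rng = np.random.default_rng(42)
target_center = rng.normal(size=128)  # stand-in for the target's embedding
other_center = rng.normal(size=128)   # stand-in for a non-target identity

# 50 noisy variants of the target embedding, 50 non-target embeddings,
# mirroring the "50+ images from one photo" augmentation idea.
X_target = target_center + rng.normal(scale=0.1, size=(50, 128))
X_other = other_center + rng.normal(scale=0.1, size=(50, 128))
X = np.vstack([X_target, X_other])
y = np.array([1] * 50 + [0] * 50)  # 1 = target face, 0 = anyone else

clf = SVC(kernel="linear").fit(X, y)

# A new noisy view of the target should be classified as class 1.
probe = target_center + rng.normal(scale=0.1, size=(1, 128))
print(clf.predict(probe))  # [1]
```

With only one non-target identity, the classifier may still mislabel unseen strangers; adding embeddings from many different non-target faces (or skipping the SVM and thresholding the L2 distance directly) would make the decision more robust.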

0 Answers