3

I'm learning about using neural networks and object detection, using Python and Keras. My goal is to detect something very specific in an image, let's say a very specific brand / type of car carburetor (part of a car engine).

The tutorials I have found so far use the detection of cats and dogs as an example, and many of them use a pre-trained VGG16 network to improve performance.

If I want to detect only my specific carburetor, and don't care about anything else in the image, does it make sense to use VGG16? Or is VGG16 only useful when you want to detect many generic items, rather than one specific item?

Edit: I only want to know if there is a specific object (carburetor) in the image. No need to locate or put a box around it. I have about 1000 images of this specific carburetor for the network to train on.

stop-cran
danv
  • By "detect", do you mean determining whether the object is in the image or not, or do you want to put a box around it? – Marcin Możejko Oct 20 '17 at 11:20
  • By "detect" I mean solely whether the object is in the image or not. No need to put a box around it. – danv Oct 27 '17 at 18:42

2 Answers

2

VGG16, like other pretrained neural networks, is primarily used for classification. That means you can use it to determine which category an image belongs to.

As I understand it, what you need is to detect where in an image a carburetor is located. For something like that you need a different, more complicated approach.

You could use

desertnaut
Ioannis Nasios
  • Thanks. Just a note: no need for me to find *where* in the image the carburetor is located. I just want to know whether there is one in the image or not. – danv Oct 27 '17 at 18:44
1

VGG16 can be used for that. (Is it the best choice? That is an open question without a clear answer.)

But you must replace its ending to fit your needs. While a regular VGG model has a thousand classes at its end, a cats vs. dogs VGG has its end changed to have two classes. In your case, you should change its ending to have only one class.

In Keras, you'd have to load the VGG model with the option include_top=False.

And you should then add your own final Dense layers (two or three dense layers at the end), making sure that the last layer has only one unit: Dense(1, activation='sigmoid').
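As a rough illustration of the two steps above, here is a minimal sketch, not a definitive implementation. It assumes the `tensorflow.keras` API. The function name `build_binary_classifier`, the 256-unit hidden layer, the dropout rate, the Adam optimizer, and the choice to freeze the pretrained layers are my own assumptions for the example, not something VGG16 or Keras requires.

```python
from tensorflow import keras
from tensorflow.keras import layers


def build_binary_classifier(input_shape=(224, 224, 3), weights="imagenet"):
    """VGG16 convolutional base plus a small yes/no head (illustrative sketch)."""
    # include_top=False drops VGG16's original 1000-class dense layers.
    base = keras.applications.VGG16(
        include_top=False, weights=weights, input_shape=input_shape
    )
    # Assumption: freeze the pretrained filters so only the new head is trained.
    base.trainable = False

    inputs = keras.Input(shape=input_shape)
    x = base(inputs, training=False)
    x = layers.Flatten()(x)
    x = layers.Dense(256, activation="relu")(x)  # hypothetical layer size
    x = layers.Dropout(0.5)(x)                   # hypothetical regularization
    # Single sigmoid unit: output is the estimated probability of "carburetor present".
    outputs = layers.Dense(1, activation="sigmoid")(x)

    model = keras.Model(inputs, outputs)
    model.compile(
        optimizer="adam", loss="binary_crossentropy", metrics=["accuracy"]
    )
    return model
```

Calling `build_binary_classifier()` downloads the ImageNet weights on first use. Training would then be a normal `model.fit(...)` on images labeled 1 (carburetor) or 0 (no carburetor), so you also need negative examples, not just the 1000 positives. Images should go through `keras.applications.vgg16.preprocess_input` first so they match what the pretrained layers expect.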

This will work for "detecting" (yes / no results).
But if your goal is "locating/segmentation", then you should create your own version of a U-net or a SegNet, for instance.

Daniel Möller
  • Thanks, Daniel. I updated my question with the note that I would like to use "detecting" rather than "locating". – danv Oct 27 '17 at 18:56