I have a query regarding the extraction of VGG16/VGG19 features for my experiments.
The pre-trained VGG16 and VGG19 models have been trained on ImageNet dataset having 1000 classes (say c1,c2, ... c1000) and normally we extract the features from first and second fully connected layers designated ('FC1' and 'FC2'); these 4096 dimensional feature vectors are then used for computer vision tasks.
My question is that can we use these networks to extract features of an image that does not belong to any of the above 1000 classes ? In other words, can we use these networks to extract features of an image with label c1001 ? Remember that c1001 does not belong to the Imagenet classes on which these networks were initially trained on.
In the article available on https://www.pyimagesearch.com/2019/05/20/transfer-learning-with-keras-and-deep-learning/, I am quoting the following -
When performing feature extraction, we treat the pre-trained network as an arbitrary feature extractor, allowing the input image to propagate forward, stopping at pre-specified layer, and taking the outputs of that layer as our features
From the above text, there is no restriction to whether the image must necessarily belong to one of the Imagenet classes.
Kindly spare some time to uncover this mystery.
In the research papers, the authors simply state that they have used features extracted from VGG16/VGG19 network pre-trained on Imagenet dataset without giving any further details.
I am giving a case study for reference:
Animal with Attribute dataset (see https://cvml.ist.ac.at/AwA2/) is a very popular dataset with 50 animal classes for image recognition task. The authors have extracted ILSVRC-pretrained ResNet101 features for the above dataset images. This ResNet 101 network has been pre-trained on 1000 imagenet classes (different imagenet classes are available at https://gist.github.com/yrevar/942d3a0ac09ec9e5eb3a#file-imagenet1000_clsidx_to_labels-txt).
Also, the AWA classes are put as follows:
antelope, grizzly+bear, killer+whale, beaver, dalmatian, persian+cat, horse
german+shepherd, blue+whale, siamese+cat, skunk, mole, tiger, hippopotamus, leopard, moose, spider+monkey, humpback+whale, elephant, gorilla, ox, fox, sheep
seal, chimpanzee, hamster, squirrel, rhinoceros, rabbit, bat, giraffe, wolf, chihuahua, rat, weasel, otter, buffalo, zebra, giant+panda, deer, bobcat, pig, lion, mouse, polar+bear, collie, walrus, raccoon, cow, dolphin
Now, if we compare the classes in the dataset with 1000 Imagenet classes, we find that classes like dolphin, cow, racoon, bobcat, bat, seal, sheep, horse, grizzly bear, giraffe etc are not there in the Imagenet and still the authors went on with extracting ResNet101 features. I believe that the features extracted are generalizable and that is why authors consider these features as meaningful representations for the AWA images.
Your take on this ?
The idea is to get the representations for the images not belonging to ImageNet classes and use them along with their labels in some other classifier.