Reduce dimensions of model's fully connected layer for image retrieval task

Question

I'm working on an image retrieval task (not involving faces), and one of the things I am trying is to swap out the softmax layer in the CNN model and use the LMNN classifier. For this purpose I fine-tuned the model and then extracted the features at the fully connected layer. I have about 3000 images right now. The fully connected layer gives a 4096-dim vector, so my final feature matrix is 3000x4096, with about 700 classes (each class has 2+ images). I believe this dimensionality is so large that the LMNN algorithm is going to take forever (it really did take forever). How can I reduce the number of dimensions? I tried PCA, but that didn't squeeze down the dimensions much (got down to 3000x3000). I am thinking a 256/512/1024-dim vector should help. If I were to add another layer to reduce dimensions, say a new fully connected layer, would I have to fine-tune my network again? Inputs on how to do that would be great! I am also currently trying to augment my data to get more images per class and increase the size of my dataset.

Thank you.

have a look at [using SVD trick to reduce FC size](http://stackoverflow.com/a/40481789/1714410) — Shai, Jan 23 '17 at 09:32
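
One common reading of an "SVD trick" for a fully connected layer is to replace its weight matrix with a rank-k factorization, so the layer becomes two smaller layers and the k-dimensional intermediate output can serve as a compact feature without retraining. The sketch below is an assumption about what such a trick looks like, not a reproduction of the linked answer; `factorize_fc` is a hypothetical helper and the small random matrix only stands in for real FC weights.

```python
import numpy as np

def factorize_fc(W, k):
    """Split W (out_dim x in_dim) into A (out_dim x k) and B (k x in_dim), W ~ A @ B."""
    U, s, Vt = np.linalg.svd(W, full_matrices=False)
    A = U[:, :k]                    # maps the k-dim code back to out_dim
    B = s[:k, None] * Vt[:k, :]     # maps the input down to a k-dim code
    return A, B

rng = np.random.default_rng(0)
# Stand-in for an FC weight matrix that is approximately low rank (assumption).
W = rng.normal(size=(64, 8)) @ rng.normal(size=(8, 128))
W += 0.01 * rng.normal(size=W.shape)

A, B = factorize_fc(W, k=8)
x = rng.normal(size=128)            # input to the original layer
z = B @ x                           # reduced k-dim representation
rel_err = np.linalg.norm(W @ x - A @ z) / np.linalg.norm(W @ x)
print(z.shape, rel_err)
```

How small k can be before the approximation hurts retrieval depends on how quickly the singular values of the real weight matrix decay, so it is worth checking `rel_err` (or retrieval accuracy) for a few values of k.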

score 2 · Answer 1 · answered Jan 23 '17 at 05:30

PCA should let you reduce the data further - you should be able to specify the desired dimensionality (for example 256 or 512 components) instead of keeping every component - see the Wikipedia article.
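
The 3000x3000 result is what you get when no target size is given: with 3000 samples and 4096 features there are at most min(3000, 4096) = 3000 components, and PCA keeps all of them. In scikit-learn the target size is the `n_components` argument of `PCA`. As a minimal sketch of the same idea using only NumPy (the `pca_reduce` helper and the random stand-in data are assumptions for illustration, with smaller shapes than your 3000x4096 matrix so it runs quickly):

```python
import numpy as np

def pca_reduce(X, n_components):
    """Project the rows of X onto the top n_components principal directions."""
    mean = X.mean(axis=0)
    Xc = X - mean                                  # PCA needs centered data
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    components = Vt[:n_components]                 # (n_components, n_features)
    return Xc @ components.T, components, mean

rng = np.random.default_rng(0)
features = rng.normal(size=(300, 512))             # stand-in for extracted CNN features
reduced, components, mean = pca_reduce(features, n_components=64)
print(reduced.shape)

# New query images must be projected with the same mean and components.
query = rng.normal(size=(1, 512))
query_reduced = (query - mean) @ components.T
```

Keep the fitted mean and components around, since queries at retrieval time have to go through exactly the same projection as the indexed images.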

As well as PCA you can try t-distributed stochastic neighbor embedding (t-SNE). I really enjoyed Wattenberg, et al.'s article - worth a read if you want to get an insight into how it works and some of the pitfalls.

In a neural net the standard way to reduce dimensionality is by adding more, smaller layers, as you suggested. Because the new layers start with untrained weights and can only learn during training, you'll need to re-run your fine-tuning. Ideally you would re-run the entire training process after changing the model structure, but if you have enough data, fine-tuning alone may still be OK.

To add new layers in TensorFlow, you would add a fully connected layer whose input is the output of your 4096-element layer and whose output size is the desired number of elements. You may repeat this if you want to go down gradually (e.g. 4096 -> 1024 -> 512). You would then perform your training (or fine-tuning) again.
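
A rough sketch of that with `tf.keras` is below. It makes some assumptions that are not in your setup: it trains only the new layers on top of already-extracted 4096-dim features (equivalent to freezing the base network), the name `build_bottleneck_head` and the layer sizes are illustrative, and random data stands in for your features and labels.

```python
import numpy as np
import tensorflow as tf

def build_bottleneck_head(input_dim=4096, hidden_dim=1024,
                          embed_dim=512, num_classes=700):
    """Stack smaller dense layers on top of the existing FC features."""
    inputs = tf.keras.Input(shape=(input_dim,))
    x = tf.keras.layers.Dense(hidden_dim, activation="relu")(inputs)
    embedding = tf.keras.layers.Dense(embed_dim, activation="relu",
                                      name="embedding")(x)
    logits = tf.keras.layers.Dense(num_classes, name="logits")(embedding)
    classifier = tf.keras.Model(inputs, logits)    # used for training
    extractor = tf.keras.Model(inputs, embedding)  # used for retrieval features
    return classifier, extractor

# Random stand-ins for the extracted features and class labels (assumption).
rng = np.random.default_rng(0)
features = rng.normal(size=(64, 4096)).astype("float32")
labels = rng.integers(0, 700, size=64)

classifier, extractor = build_bottleneck_head()
classifier.compile(
    optimizer="adam",
    loss=tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True),
)
classifier.fit(features, labels, epochs=1, batch_size=16, verbose=0)

# The two models share layers, so the extractor sees the trained weights.
embeddings = extractor.predict(features, verbose=0)
print(embeddings.shape)
```

The 512-dim `embeddings` are what you would then feed to LMNN in place of the 4096-dim vectors. If you fine-tune the whole network end to end instead, the same three `Dense` layers would be attached to the output of the existing fully connected layer rather than to cached features.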

Lastly, I did a quick search and found this paper that claims to support LMNN over large datasets through random sampling. You might be able to use that to save a few headaches: Fast LMNN Algorithm through Random Sampling

Thanks for your inputs, Mark! I did try PCA, but scikit only came down from 4096 to 3000. I will look at t-SNE too. I guess I will also try re-run fine tuning with lower set of dimensions. The random sampling article looks really interesting and I am definitely going to read that, and try to incorporate it. As my data has high number of classes, I might as well take some help from that algorithm. — rookie, Jan 23 '17 at 13:27
