I'm working on an image retrieval task (not involving faces), and one of the things I am trying is to swap out the softmax layer of the CNN and use an LMNN (Large Margin Nearest Neighbor) classifier instead. For this purpose I fine-tuned the model and then extracted the features at the fully connected layer.

I have about 3000 images right now, and the fully connected layer gives a 4096-dimensional vector per image, so my final feature matrix is 3000x4096, with about 700 classes (each class has 2+ images). I believe this dimensionality is far too large for LMNN, which will take forever to train (it really did take forever).

How can I reduce the number of dimensions? I tried PCA, but that didn't squeeze the dimensions down much (I only got down to 3000x3000). I am thinking a 256-, 512-, or 1024-dimensional vector should help. If I were to add another layer to reduce the dimensions, say a new fully connected layer, would I have to fine-tune my network again? Input on how to do that would be great!

I am also currently trying to augment my data to get more images per class and increase the size of my dataset.
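
To make the PCA step concrete, here is a minimal sketch of the kind of reduction I have in mind. It assumes plain NumPy and uses a small random placeholder matrix in place of my real features; the function name `pca_reduce` and the sizes are only illustrative. As far as I understand, PCA can return at most min(n_samples, n_features) components, which may be why I ended up with 3000 columns; the sketch asks for a fixed, smaller number instead:

```python
import numpy as np


def pca_reduce(features, n_components):
    """Project the rows of `features` onto the top `n_components` principal axes.

    Returns the reduced matrix, the principal axes, the feature mean, and the
    fraction of total variance kept by the selected components.
    """
    mean = features.mean(axis=0)
    centered = features - mean
    # Thin SVD: the rows of vt are the principal axes, ordered by decreasing variance.
    _, singular_values, vt = np.linalg.svd(centered, full_matrices=False)
    components = vt[:n_components]
    variances = singular_values ** 2
    explained = variances[:n_components].sum() / variances.sum()
    return centered @ components.T, components, mean, float(explained)


rng = np.random.default_rng(0)
# Placeholder (assumption): random data standing in for the real 3000x4096
# fully connected features, kept small here so the sketch runs quickly.
features = rng.standard_normal((300, 512)).astype(np.float32)

reduced, components, mean, explained = pca_reduce(features, n_components=64)
print(reduced.shape, explained)
```

On the real data the call would be `pca_reduce(features, n_components=256)` on the 3000x4096 matrix, and new query images would need to be projected with the same `mean` and `components`. Whether the retained variance is enough for LMNN to work well is something that would still need to be checked.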
Thank you.