
I have to extract SIFT features from a dataset of 1500 images that will later be used for Bag of Words. On one image, for example, the result is 3168 features, which takes megabytes of memory. Is saving all the features the only way? Since each image produces a [frames, descriptors] result of different dimensions, what is a good way to save the results?
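For reference, the extraction step looks roughly like this (a minimal sketch assuming OpenCV >= 4.4; the filename is a placeholder):

```python
# Minimal sketch of per-image SIFT extraction, assuming OpenCV >= 4.4
# (older builds use cv2.xfeatures2d.SIFT_create() instead).
import cv2

img = cv2.imread("image_0001.jpg", cv2.IMREAD_GRAYSCALE)  # placeholder path
sift = cv2.SIFT_create()
keypoints, descriptors = sift.detectAndCompute(img, None)

# descriptors is an (n, 128) float32 array where n varies per image
# (e.g. 3168 here), so the per-image results cannot be stacked into
# one fixed-shape array.
print(descriptors.shape)
```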

cagatayodabasi

2 Answers


You mention using the features later. How are you currently planning to persist them? Re-extracting the features every time will work, but for 1500 images it will be a slow process.
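If you do decide to keep the raw descriptors, here is a minimal sketch of one way to store the variable-shape arrays, using a compressed NumPy archive with one entry per image (function and file names are placeholders):

```python
import numpy as np

def save_descriptors(descriptors_by_image, path="sift_descriptors.npz"):
    # descriptors_by_image maps image name -> (n, 128) float32 array;
    # each image gets its own archive entry, so the varying n per
    # image is not a problem.
    np.savez_compressed(path, **descriptors_by_image)

def load_descriptors(path="sift_descriptors.npz"):
    with np.load(path) as data:
        return {name: data[name] for name in data.files}
```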

One option you could use is k-means clustering to generate a codebook based on all of the features/descriptors. I did this for a corpus of 1100 images and my resulting codebook is < 1 MB with 200 clusters. My codebook is saved as a serialized (pickled) object in Python so it can be opened easily when needed.

Here is a crash course on k-means (assuming you are using Python): http://www.pyimagesearch.com/2014/05/26/opencv-python-k-means-color-clustering/
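A minimal sketch of that codebook step, assuming scikit-learn (all_descriptors is a placeholder for your list of per-image descriptor arrays):

```python
import pickle

import numpy as np
from sklearn.cluster import KMeans

# all_descriptors: list of (n_i, 128) arrays, one per image (placeholder)
stacked = np.vstack(all_descriptors)            # shape (sum n_i, 128)
codebook = KMeans(n_clusters=200).fit(stacked)  # 200 visual words

with open("codebook.pkl", "wb") as f:           # pickle for later reuse
    pickle.dump(codebook, f)

# Later, quantize one image's descriptors into a 200-bin BoVW histogram:
# words = codebook.predict(descriptors)
# hist = np.bincount(words, minlength=200)
```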

Are you using the bag of visual words from the 1500 images for a different dataset? Or are you going to use BoVW to match within the 1500 images?

Noah Christopher
  • I would like to use a set of 1500 images as the training set (Bag of Visual Words) and test it on other images. Please correct me if my approach is incorrect. – wannabegeek Nov 21 '16 at 21:47

I guess there is an issue with storage. In my experience, Python raises an "insufficient memory" error while stacking SIFT descriptors, or the program cannot allocate enough memory to hold them all. This is just a hypothesis.
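If memory is indeed the problem, a minimal sketch of a workaround using scikit-learn's MiniBatchKMeans, which consumes descriptors one image at a time instead of stacking everything (extract_sift and image_paths are hypothetical):

```python
from sklearn.cluster import MiniBatchKMeans

kmeans = MiniBatchKMeans(n_clusters=200)
for path in image_paths:              # image_paths: your 1500 files
    descriptors = extract_sift(path)  # hypothetical: (n, 128) array or None
    # each image typically yields far more than 200 descriptors
    # (e.g. 3168 in the question), enough for a partial_fit batch
    if descriptors is not None:
        kmeans.partial_fit(descriptors)
```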

jude