My objective is to classify images into one of a few predefined categories (SportShoes, Shirts, Heels, Watches, ...) from my catalog, and later on to return similar images from the catalog.

I am using Dense-SIFT for feature extraction, representing each image as a Bag of Visual Words, and an SVM for classification. All my training images are taken from the catalog.
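
To make the encoding step concrete, here is a minimal sketch of how one image becomes a Bag of Visual Words histogram. It is only an illustration under stated assumptions: the random arrays stand in for real Dense-SIFT descriptors and for a vocabulary that would normally be learned with k-means, and the name `bow_histogram` and the sizes (50 words, 400 descriptors) are made up for the example.

```python
import numpy as np

def bow_histogram(descriptors, vocabulary):
    """Encode one image as an L1-normalised histogram of visual words.

    descriptors: (n, d) array of local descriptors (e.g. 128-D Dense-SIFT).
    vocabulary:  (k, d) array of visual-word centres (e.g. from k-means).
    """
    # Squared Euclidean distance from every descriptor to every visual word.
    diff = descriptors[:, None, :] - vocabulary[None, :, :]
    dist2 = (diff ** 2).sum(axis=2)
    # Hard assignment: each descriptor votes for its nearest word.
    words = dist2.argmin(axis=1)
    hist = np.bincount(words, minlength=len(vocabulary)).astype(float)
    return hist / hist.sum()

# Stand-in data (assumption): random values instead of real SIFT output.
rng = np.random.default_rng(0)
vocabulary = rng.random((50, 128))
descriptors = rng.random((400, 128))
hist = bow_histogram(descriptors, vocabulary)
```

The resulting `hist` is the fixed-length vector that goes to the SVM. Note that it discards where in the image each descriptor came from, so background content such as a foot contributes to the histogram just like the shoe does.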

The problem is that my query images are pictures taken with a camera, and these look very different from the catalog images. For example, all the Heels/SportShoes in my catalog show only the right shoe, photographed at one particular angle, whereas a query image contains the heel and part of the foot as well, and the angle at which the photo is taken can vary (deviating from the catalog images).

Hence the classification works only when my query (test) image is a catalog image (one that I have NOT used for training), but not for images taken with the camera.

How do I proceed? Is the problem with my feature vector or with my training data itself? If I cannot change the training data, is there anything else I can use? Should I use a completely different approach (not bag-of-words)?

Thanks

user3705926
  • You can enlarge the training data by adding affine transformations of the catalog images. Not sure how much that would help though. – GilLevi Jun 19 '14 at 08:15
  • Hi, definitely change the training data! If viewpoint is still a major issue, try sparse SIFT BoW instead. Also maybe try Exemplar SVM (http://www.cs.cmu.edu/~tmalisie/projects/iccv11/) with dense SIFT and a whole bunch of examples from different views. In general, though, the problem of generalization, especially to new viewpoints, is open. – QED Jun 23 '14 at 19:05
  • @QED sorry to comment on myself, but I just recalled that sparse SIFT may not be suitable for non-textured objects like shiny heels. – QED Jun 23 '14 at 19:23
  • Hi, thank you for the reply. I will check it out. What is your take on using convolutional neural networks for the same? – user3705926 Jun 24 '14 at 09:59
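
The suggestion in the comments to enlarge the training set with affine transformations of the catalog images could look roughly like the sketch below. It is an assumption-laden illustration, not a tested recipe: the function name `random_affine`, the parameter ranges, and the random stand-in image are all made up, and nearest-neighbour sampling is used only to keep the code short.

```python
import numpy as np

def random_affine(image, rng, max_rot_deg=20.0, max_shear=0.2,
                  scale_range=(0.8, 1.2)):
    """Return a randomly rotated, sheared and scaled copy of an image.

    Works on (h, w) or (h, w, channels) arrays; uncovered pixels are zero.
    """
    h, w = image.shape[:2]
    theta = np.deg2rad(rng.uniform(-max_rot_deg, max_rot_deg))
    shear = rng.uniform(-max_shear, max_shear)
    scale = rng.uniform(*scale_range)
    rot = np.array([[np.cos(theta), -np.sin(theta)],
                    [np.sin(theta), np.cos(theta)]])
    shr = np.array([[1.0, shear], [0.0, 1.0]])
    forward = scale * rot @ shr          # forward map about the image centre
    inverse = np.linalg.inv(forward)     # used to look up source pixels
    centre = np.array([(w - 1) / 2.0, (h - 1) / 2.0])

    # For every output pixel (x, y), find the source pixel it comes from.
    ys, xs = np.mgrid[0:h, 0:w]
    out_xy = np.stack([xs.ravel(), ys.ravel()], axis=1) - centre
    src_xy = out_xy @ inverse.T + centre
    sx = np.rint(src_xy[:, 0]).astype(int)
    sy = np.rint(src_xy[:, 1]).astype(int)
    valid = (sx >= 0) & (sx < w) & (sy >= 0) & (sy < h)

    src = image.reshape(h * w, -1)
    out = np.zeros_like(src)
    out[valid] = src[sy[valid] * w + sx[valid]]
    return out.reshape(image.shape)

# Stand-in for one catalog image (assumption): random grayscale pixels.
rng = np.random.default_rng(0)
catalog_image = rng.random((64, 48))
augmented = [random_affine(catalog_image, rng) for _ in range(5)]
```

Each augmented copy would then go through the same feature extraction as the original. Affine warps only approximate in-plane changes, so they are unlikely to account for the extra foot or for large out-of-plane viewpoint changes in the camera photos.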
