What's the difference between "Bag of Words" and "Bag of features" in computer vision?

Question

Researching the subject, one can find papers where the authors perform image classification / retrieval using the "Bag of Words" model, while others do similar tasks using a "Bag of features" model.

Even though I have a basic understanding of the method involved (detect and extract visual words, build a visual dictionary, use machine learning to train a classifier), I still can't see the difference between the two models. Are they synonyms? Maybe I have missed concrete examples / documentation that show the difference...

score 7 · Answer 1 · answered Aug 25 '13 at 10:25

At first there was the Bag of Words model for document retrieval. This model treats every document (and the query too) as a bag of words, without taking the position of each word into account. Every document is thus transformed into a vector whose length is the size of the language dictionary and whose entries hold the frequency of each term (a histogram).
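To make the text case concrete, here is a minimal sketch of that representation. The function name `bag_of_words` and the two toy documents are made up for illustration, and the "language dictionary" is assumed to be just the set of words seen in those documents.

```python
from collections import Counter

def bag_of_words(text, vocabulary):
    """Return the term-frequency vector of `text` over `vocabulary`.

    Word order is discarded; only the counts survive.
    """
    counts = Counter(text.lower().split())
    return [counts[term] for term in vocabulary]

# Two toy documents with the same words in a different order (illustrative only).
docs = ["the cat sat on the mat", "the mat sat on the cat"]

# Assumption: the dictionary is simply every word seen in the toy corpus.
vocabulary = sorted({word for doc in docs for word in doc.lower().split()})

vectors = [bag_of_words(doc, vocabulary) for doc in docs]
```

Because positions are ignored, both documents end up with the same histogram even though the sentences differ.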

The Bag of Visual Words (BoVW), or Bag of Features (BoF), model replaces the document with an image and the words with features (or "visual words"), and creates a very similar representation of an image. So yes, BoF is a synonym of BoVW. BoW is about text retrieval.
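The image case can be sketched the same way. This is only an illustration under stated assumptions: the names `nearest_word` and `bag_of_features` are hypothetical, the descriptors are toy 2-D points standing in for real local descriptors (such as SIFT), and the codebook is hard-coded here, whereas in practice it is usually learned by clustering descriptors from a training set (for example with k-means).

```python
def nearest_word(descriptor, codebook):
    """Index of the visual word closest to `descriptor` (squared Euclidean distance)."""
    def sq_dist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))
    return min(range(len(codebook)), key=lambda i: sq_dist(descriptor, codebook[i]))

def bag_of_features(descriptors, codebook):
    """Histogram of visual-word occurrences for one image.

    As with text, the position of each feature in the image is ignored.
    """
    histogram = [0] * len(codebook)
    for descriptor in descriptors:
        histogram[nearest_word(descriptor, codebook)] += 1
    return histogram

# Assumption: a tiny hand-written visual dictionary with three visual words.
codebook = [(0.0, 0.0), (1.0, 1.0), (5.0, 5.0)]

# Assumption: toy descriptors extracted from a single image.
descriptors = [(0.1, -0.2), (0.9, 1.2), (4.8, 5.1), (1.1, 0.8)]

histogram = bag_of_features(descriptors, codebook)
```

The resulting histogram plays the same role as the term-frequency vector of a text document and can be fed to a classifier or used for retrieval.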
