
Usually, algorithms such as SIFT, SURF and many others provide a set of k keypoints and the associated descriptors in d dimensions (for example, in SIFT each descriptor has d=128 dimensions).

So, in order to describe an image we need a k×d matrix (k descriptor vectors, each one in d dimensions). So far so good.

My question is: how can we describe an image through a single vector?

This could be really useful, since we could save a lot of space and because certain algorithms (like LSH) require a vector as input/query.

In some papers (for example this, section 6.5) this approach is described as "global descriptors".

Up to now, I have found only this paper, but it doesn't seem so accurate (and it's from 2009, so not very recent).

UPDATE: Other possible solutions (some suggested in the comments):

  • Visual bag of words

  • GIST descriptor
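
To make the visual bag of words option concrete, here is a minimal sketch of how a k×d descriptor matrix could be collapsed into one fixed-size vector. It is only an illustration under assumptions: the descriptors are random stand-ins for real SIFT output, the codebook size of 16 is arbitrary, and the helper names (`build_codebook`, `assign_words`, `bow_vector`) are hypothetical, not taken from any library. A real pipeline would likely use a tuned k-means implementation and a much larger codebook.

```python
import numpy as np

def assign_words(descriptors, centers):
    """Index of the nearest center (squared Euclidean) for each descriptor."""
    d2 = ((descriptors[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
    return d2.argmin(axis=1)

def build_codebook(descriptors, n_words, n_iters=10, seed=0):
    """Tiny k-means: cluster local descriptors into n_words visual words."""
    rng = np.random.default_rng(seed)
    # Start from n_words distinct descriptors picked at random.
    picks = rng.choice(len(descriptors), n_words, replace=False)
    centers = descriptors[picks].astype(float)
    for _ in range(n_iters):
        labels = assign_words(descriptors, centers)
        for w in range(n_words):
            members = descriptors[labels == w]
            if len(members) > 0:  # keep the old center if a cluster is empty
                centers[w] = members.mean(axis=0)
    return centers

def bow_vector(descriptors, centers):
    """Map a (k, d) descriptor matrix to one L1-normalized histogram."""
    labels = assign_words(descriptors, centers)
    hist = np.bincount(labels, minlength=len(centers)).astype(float)
    return hist / hist.sum()

# Assumption: random data standing in for real descriptors (d = 128 as in SIFT).
rng = np.random.default_rng(42)
training = rng.random((500, 128))          # descriptors pooled from many images
image_descriptors = rng.random((37, 128))  # k = 37 descriptors of one image

codebook = build_codebook(training, n_words=16)
vec = bow_vector(image_descriptors, codebook)
print(vec.shape)  # (16,)
```

The length of the resulting vector depends only on the codebook size, not on k, so images with different numbers of keypoints all map to vectors of the same dimension, which is the form an LSH index expects.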

justHelloWorld
  • 2
    This doesn't seem to be a programming question as such; it may be more appropriate for http://dsp.stackexchange.com/questions/tagged/image-processing – EdChum May 26 '16 at 08:21
  • 1
    Simplest thing is Bag of Words (BoW), where you basically aggregate local descriptors into a single fixed size vector. – Miki May 26 '16 at 08:24
  • @Miki Can you tell me more about them? – justHelloWorld May 26 '16 at 08:45
  • 1
    Google for that. Too broad to explain here.. It's a well known (and simple) approach, you'll find it easily. It's called also _bag of visual words_ – Miki May 26 '16 at 08:53
  • As I wrote in the **update** section, the GIST descriptor is another solution – justHelloWorld May 26 '16 at 09:04
