I am trying to implement the algorithm by Jason Hipp et al. There is also a presentation, which is shorter and easier to follow.
A brief description of their approach:
They use Vector Quantization as a tool to distinguish between foreground and background in any given image. However, instead of using square regions as feature vectors to generate the Codewords, they use circles. This is supposed to decrease the computational complexity: with a circle as predicate vector, the matching problem is reduced to a linear pattern-matching task and allows for spatially invariant matching. Hence the method is called Spatially Invariant Vector Quantization.
So basically, a predicate vector is chosen interactively, and then the image is scanned exhaustively, computing the correlation of this predicate vector at every position.
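For what it's worth, my current understanding of the ring predicate is something like the following sketch (plain NumPy, grayscale image assumed; `sample_ring` and the parameter names are my own, not from the paper):

```python
import numpy as np

def sample_ring(image, cy, cx, radius, n_samples=64):
    """Sample pixel values on a circle of given radius around (cy, cx).

    The returned 1-D vector is what I understand to be the 'predicate
    vector': a rotation of the image content around the centre becomes
    a circular shift of this vector.
    """
    angles = np.linspace(0.0, 2.0 * np.pi, n_samples, endpoint=False)
    ys = np.clip(np.round(cy + radius * np.sin(angles)).astype(int), 0, image.shape[0] - 1)
    xs = np.clip(np.round(cx + radius * np.cos(angles)).astype(int), 0, image.shape[1] - 1)
    return image[ys, xs].astype(float)
```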
My questions are:
1. Where in the whole algorithm do they generate the Codebook? And how?
2. I cannot see how to choose the parameters for the Codebook to be generated. If they sample all possible circles at all possible positions in the image first, that is computationally extremely heavy. How do they determine the number of clusters/codewords to generate?
3. Why would I wobble the sub-rings against each other?
Right now my implementation basically uses one circle with one radius as the predicate vector. It marches through the native image space and correlates the predicate vector with the circle around the current pixel in all possible rotations. This is an extremely slow process, and I cannot see the benefit of their algorithm. I have not implemented anything that comes close to Vector Quantization, because I cannot see how this would work.
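To make the question concrete, this is roughly what my brute-force matcher does (a simplified sketch of my own code, not the authors' method; function names and the normalization are mine):

```python
import numpy as np

def sample_ring(image, cy, cx, radius, n_samples=64):
    # Pixel values on a circle of the given radius around (cy, cx).
    angles = np.linspace(0.0, 2.0 * np.pi, n_samples, endpoint=False)
    ys = np.clip(np.round(cy + radius * np.sin(angles)).astype(int), 0, image.shape[0] - 1)
    xs = np.clip(np.round(cx + radius * np.cos(angles)).astype(int), 0, image.shape[1] - 1)
    return image[ys, xs].astype(float)

def best_correlation(ring, predicate):
    # Maximum normalized correlation over all circular shifts (rotations).
    p = (predicate - predicate.mean()) / (predicate.std() + 1e-9)
    best = -1.0
    for shift in range(len(ring)):
        r = np.roll(ring, shift)
        r = (r - r.mean()) / (r.std() + 1e-9)
        best = max(best, float(np.dot(r, p)) / len(ring))
    return best

def match_image(image, predicate, radius):
    # Exhaustive march over the image: one score per interior pixel
    # (border pixels are left at zero).
    h, w = image.shape
    scores = np.zeros((h, w))
    for cy in range(radius, h - radius):
        for cx in range(radius, w - radius):
            ring = sample_ring(image, cy, cx, radius, n_samples=len(predicate))
            scores[cy, cx] = best_correlation(ring, predicate)
    return scores
```

The inner loop over all shifts is the slow part; a circular cross-correlation via FFT would evaluate all rotations at once, but even with that I do not see where the codebook generation fits in.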
Any hint or thought is appreciated. The authors of the method didn't respond to my questions, unfortunately.