I am reading about Product Quantization, from section II.A page 3 of PQ for NNS, that says:

..all subquantizers have the same finite number k* of reproduction values. In that case the number of centroids is (k*)^m

where m is the number of subvectors.

However, I do not get k* at all! I mean, in vector quantization we assign every vector to one of k centroids. In product quantization, we assign every subvector to one of k centroids. How did k* come into play?

gsamaras

1 Answer

I think k* is the number of centroids in each subspace, and k is the number of centroids in the whole space.

For example, if the data is 2D, like (x, y), and we treat each dimension as a subspace and run k-means in each one separately with, say, k*=3, we'll get 3 centroids in each subspace: {x1, x2, x3} and {y1, y2, y3}.

Then there'll be 3^2=9 possible centroids in the whole space, which are* (x1, y1), (x1, y2), (x1, y3), (x2, y1)...

In this way we can get a large number of centroids (2^64 in the paper) using a small amount of memory, because we don't have to store all (k*)^m centroids explicitly; we only need to store the k* centroids of each subspace.
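To make the memory argument concrete, here is a rough sketch of the arithmetic. The function name `pq_storage` and the settings D=128, m=8, k*=256 are my own illustrative assumptions (chosen because 256^8 = 2^64), not something quoted from the question.

```python
def pq_storage(D, m, k_star):
    """Count centroids and stored floats for a product quantizer.

    D      : dimension of the full vectors
    m      : number of subspaces (must divide D)
    k_star : centroids per subspace
    """
    assert D % m == 0, "D must be divisible by m"
    d_star = D // m                    # dimension of each subvector (D*)
    k = k_star ** m                    # centroids implied in the whole space
    pq_floats = m * k_star * d_star    # floats actually stored (= k_star * D)
    flat_floats = k * D                # floats an explicit codebook would need
    return k, pq_floats, flat_floats


# Assumed illustrative setting: 128-d vectors, 8 subspaces, 256 centroids each.
k, pq_floats, flat_floats = pq_storage(D=128, m=8, k_star=256)
print(k == 2**64)      # the implied codebook has 2^64 entries
print(pq_floats)       # floats stored by the product quantizer
print(flat_floats)     # floats needed to store all centroids explicitly
```

The stored count grows as k* times D, while the implied codebook grows as (k*)^m, which is where the saving comes from.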

Edit:
In the example above, the number of subspaces is m=2, the number of centroids in each subspace is k*=3, the number of centroids in the whole space is k=3^2=9, the number of dimensions of each subspace is D*=1, and the number of floating-point values to store is mD*k*=Dk*=6.


*the cartesian product of x and y
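The 2D example can be sketched in a few lines of Python. The centroid values below are made up for illustration, and `encode`/`decode` are hypothetical helper names; in practice the sub-codebooks would come from running k-means on each subspace.

```python
from itertools import product

# Made-up sub-codebooks for the two 1-d subspaces (k* = 3 each).
x_centroids = [0.0, 1.0, 2.0]
y_centroids = [10.0, 20.0, 30.0]
codebooks = [x_centroids, y_centroids]   # m = 2 subspaces

# The whole-space centroids are the Cartesian product: (k*)^m = 3^2 = 9.
all_centroids = list(product(*codebooks))


def encode(vec, codebooks):
    """Quantize each coordinate to the index of its nearest sub-centroid."""
    return tuple(
        min(range(len(cb)), key=lambda i: (cb[i] - v) ** 2)
        for v, cb in zip(vec, codebooks)
    )


def decode(code, codebooks):
    """Rebuild the whole-space centroid from the per-subspace indices."""
    return tuple(cb[i] for i, cb in zip(code, codebooks))


code = encode((0.9, 27.0), codebooks)
approx = decode(code, codebooks)

# Only m * k* = 6 values are stored (each sub-centroid is one float here,
# since D* = 1), although 9 whole-space centroids are available.
stored_floats = sum(len(cb) for cb in codebooks)
print(len(all_centroids), stored_floats, code, approx)
```

Note that `all_centroids` is built here only to show the count; a real product quantizer never materializes it and works with the per-subspace indices instead.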

dontloo
  • Thus we have to store `m * k*` centroids in each subspace, right? Or you omitted that since `m = 1` in your example (isn't that the value of `m` in your example)? Thanks for the answer. :) – gsamaras Jul 15 '16 at 01:47
  • @gsamaras I think it's `k*` for each subspace, `mk*` in total, as said at the bottom of page 3: "Instead, we store the m × k* centroids of all the subquantizers." – dontloo Jul 15 '16 at 01:53
  • Oh missed that, yes surely that's it, I agree! I also assume that yes, `m = 1` in your example. – gsamaras Jul 15 '16 at 01:58
  • @gsamaras hmm I think it's `m=2` in my example (m is the number of subspaces), because I used each dimension as a subspace. Maybe what you mean is `D*=1` (D* is the number of dimensions of each subspace). – dontloo Jul 15 '16 at 02:05
  • Exactly! You might want to address that in your answer, thanks again! – gsamaras Jul 15 '16 at 02:06
  • I posted the last relevant question [here](http://stackoverflow.com/questions/38388748/why-we-need-a-coarse-quantizer), if you have time, please take a look (notice this is different from the previous comment, done 2h ago). – gsamaras Jul 15 '16 at 05:54