
I'm reading the paper "Product quantization for nearest neighbor search".

For the last row of Table II (page 5), the paper notes:

"the complexity given in this table for searching the k smallest elements is the average complexity for n >> k and when the elements are arbitrarily ordered"

and the complexity it lists is `n + k log k log log n`.

I guess we can use a linear-time selection algorithm to get the k nearest neighbors, unsorted, in O(n), and then sort those k neighbors in O(k log k), which gives O(n + k log k) in total. But where does the `log log n` term come from?
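To make the reasoning above concrete, here is a minimal sketch of the select-then-sort approach I have in mind. It is only my own illustration, not the procedure from the paper or from Knuth: the function name `k_smallest_sorted` is hypothetical, and the selection step is a randomized quickselect, which runs in O(n) on average rather than in the worst case.

```python
import random


def k_smallest_sorted(distances, k):
    """Return the k smallest values of `distances` in ascending order.

    Expected cost: O(n) for the selection plus O(k log k) for the final sort.
    """
    items = list(distances)  # work on a copy, leave the input untouched
    if k <= 0:
        return []
    if k >= len(items):
        return sorted(items)

    target = k - 1  # index that must end up holding the k-th smallest value
    lo, hi = 0, len(items) - 1
    while lo < hi:
        pivot = items[random.randint(lo, hi)]
        # Three-way partition of items[lo..hi] around the pivot value.
        lt, i, gt = lo, lo, hi
        while i <= gt:
            if items[i] < pivot:
                items[lt], items[i] = items[i], items[lt]
                lt += 1
                i += 1
            elif items[i] > pivot:
                items[i], items[gt] = items[gt], items[i]
                gt -= 1
            else:
                i += 1
        # Now items[lo:lt] < pivot, items[lt:gt + 1] == pivot, the rest > pivot.
        if target < lt:
            hi = lt - 1
        elif target > gt:
            lo = gt + 1
        else:
            break  # target falls inside the block equal to the pivot

    # items[:k] holds the k smallest values in arbitrary order; sort only those.
    return sorted(items[:k])
```

For example, `k_smallest_sorted([5.0, 1.0, 4.0, 1.0, 9.0, 2.0, 7.0], 3)` returns `[1.0, 1.0, 2.0]`. Nothing in this sketch produces a `log log n` factor, which is exactly what puzzles me.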

The paper cites the TAOCP book for this, but I don't have the book at hand. Could anyone explain it to me?

dontloo
  • It seems to be just poor table formatting. The "find the k smallest distances" entry is `n+k log k` for SDC, and `log log n` for ADC. Since the columns are too close to each other, you read them together as `n+k log k log log n` – user31264 Jun 14 '16 at 05:38
  • @user31264 thanks! I was thinking about that too, but a `log log n` time complexity makes even less sense to me. – dontloo Jun 14 '16 at 05:42
  • @user31264 and dontloo, can you please take a look at my [question](http://stackoverflow.com/questions/38382217/k-reproduction-values)? – gsamaras Jul 14 '16 at 21:26

1 Answer


First, Table II reports the complexity of each step, so you have to add all the terms to get the complexity of ADC.

The last line of the table gives a single complexity for both SDC and ADC, which is:

n + k log k log log n

This term corresponds to the average algorithmic cost of the selection algorithm that we employ to find the k smallest values in a set of n variables, which we copied from Donald Knuth's book [25].

I don't have the book at hand, so I cannot check, but it sounds right.


From the authors

gsamaras
  • thank you, i'm not quite sure how it becomes `loglogn` in the ADC case, could you explain that a bit please? thanks! – dontloo Jun 14 '16 at 14:50
  • See my update @dontloo, that should be enough, if you have a new question, post a new one. :) – gsamaras Jun 14 '16 at 18:33
  • thank you so much for the clarification, and that is where I'm confused, I can only make sense of `n + k log k`, which is the closest form to this formula I can get. anyways, thank you and I think I'll post a clearer question beyond the context of this paper. – dontloo Jun 15 '16 at 02:03
  • just wondering where did you find this? have you contacted the authors? – dontloo Jun 15 '16 at 02:10
  • As I have written in my answer, yes @dontloo! :) – gsamaras Jun 15 '16 at 15:33