0

Google's Dremel algorithm supports top-k queries. Could somebody tell me what algorithm that top-k query makes use of?

user152468
  • 3,202
  • 6
  • 27
  • 57

2 Answers2

1

like a Heap ?

A heap can be used to answer query asking for top k elements in a sorted list, in a O(nlogk) time.

see http://stevehanov.ca/blog/index.php?id=122

lavin
  • 2,276
  • 2
  • 13
  • 15
0

I guess you know about the Dremel Paper?

Here is a link: https://static.googleusercontent.com/media/research.google.com/en//pubs/archive/36632.pdf

It says:

Some Dremel queries, such as top-k and count-distinct, return approximate results using known one-pass algorithms (e.g., [4]).

The reference is the following:

[4] Z. Bar-Yossef, T. S. Jayram, R. Kumar, D. Sivakumar, and L. Trevisan. Counting Distinct Elements in a Data Stream. In RANDOM, pages 1–10, 2002.

Does this help?

user152468
  • 3,202
  • 6
  • 27
  • 57