Consider the task of finding the top-k elements in a stream of N independent and identically distributed (iid) floating-point values. Using a min-heap (priority queue) of size k, we can iterate once over all N elements and maintain the current top-k set with the following operations (a sketch follows the list):
- if the element x is "worse" than the heap's head: discard x ⇒ complexity O(1)
- if the element x is "better" than the heap's head: remove the head and insert x ⇒ complexity O(log k)
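For concreteness, here is a minimal Python sketch of the loop I have in mind (the name `top_k` and the use of the standard `heapq` module are my own choices, not from any particular source):

```python
import heapq
from typing import Iterable, List

def top_k(values: Iterable[float], k: int) -> List[float]:
    """Return the k largest of the input values in a single pass."""
    if k <= 0:
        return []
    heap: List[float] = []  # min-heap; heap[0] is the "head", the worst value kept
    for x in values:
        if len(heap) < k:
            heapq.heappush(heap, x)     # fill phase: the first k elements always enter
        elif x > heap[0]:
            heapq.heapreplace(heap, x)  # "better" than the head: replace it, O(log k)
        # else: "worse" than the head, discard x -- the O(1) branch
    return sorted(heap, reverse=True)
```

(As far as I know, this is essentially the strategy behind Python's `heapq.nlargest`.)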
The worst-case time complexity of this approach is obviously O(N log k), but what about the average time complexity? Under the iid assumption, the i-th element scanned is equally likely to fall at any rank among the first i elements, so the probability that it is "better" than the heap's head is only k/i for i > k. The O(1) branch therefore dominates as the scan progresses, and the costly O(log k) updates become increasingly rare, especially for k ≪ N.
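To make that intuition concrete, here is my back-of-the-envelope calculation (assuming a continuous distribution, so ties occur with probability zero):

$$\mathbb{E}[\text{heap updates}] = k + \sum_{i=k+1}^{N} \frac{k}{i} = k\,\bigl(1 + H_N - H_k\bigr) \approx k\left(1 + \ln\frac{N}{k}\right),$$

where $H_n$ is the $n$-th harmonic number. Each update costs O(log k), so the expected total cost would be $O\!\left(N + k \log k \log\frac{N}{k}\right)$, i.e. linear in N for fixed k.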
What is the average time complexity of this approach, and is it documented in a citable reference? If so, please include the citation.