I am looking at these snippets from CLRS:

Description of the hash table search time:

CLRS description of hash table search time when collisions are solved with chaining

Definition of α as n/m:

CLRS definition of α (n/m), n being the number of keys actually stored and m the size of the hash table

Explanation of θ ( removing constant factors and only keeping the order of growth ):

CLRS explanation of θ ( removing constant factors and only keeping the order of growth )

I do not understand this usage of the theta notation. We define the load factor α as n/m, n being the number of elements stored in the hash table and m being the number of slots in the hash table. So writing θ(1+α) is exactly the same as writing θ(1+n/m).

And θ(1+n/m) is by definition the set of functions f(n) such that there exist positive constants c₁, c₂ and n₀ such that:

      0 ≤ c₁(1+n/m) ≤ f(n) ≤ c₂(1+n/m) for any n ≥ n₀

Which seems to be included in θ(α), the set of functions f(n) such that there exist positive constants c₁, c₂ and n₀ such that:

      0 ≤ c₁α ≤ f(n) ≤ c₂α for any n ≥ n₀

Moreover, there are two possibilities: either α is a constant or it is not. If α is a constant, θ(1+α) is like writing θ(α), if I am not wrong. If α is not a constant, it means it depends on n, so θ(1+α) is like writing θ(α) again (because constants don't matter when you have a function and use the θ notation, as the third quote says).

So in any possible case we could/should have written θ(α) instead of θ(1+α), couldn't we?

I don't understand why there is a 1+ before the α.

trincot

1 Answer


either α is a constant or it is not. If α is a constant, θ(1+α) is like writing θ(α), if I am not wrong. If α is not a constant, it means it depends on n, so θ(1+α) is like writing θ(α) again [...] So in any possible case we could/should have written θ(α) instead of θ(1+α), couldn't we? I don't understand why there is a 1+ before the α.

Your reasoning is correct, but a few remarks:

If α is a positive constant, we could also write θ(1+α) as θ(1), which is simpler than θ(α), but you are right that θ(1+α) can also be written as θ(α) in that case.
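For a fixed positive α, the identity 1+α = (1+1/α)·α gives explicit constants for the θ definition. The following is only a small sketch to make that concrete; the function name `theta_constants` is made up for this illustration, and `Fraction` is used just to keep the arithmetic exact.

```python
from fractions import Fraction


def theta_constants(alpha):
    """Return (c1, c2) such that c1*alpha <= 1 + alpha <= c2*alpha.

    Only valid for a fixed alpha > 0 (hypothetical helper for illustration).
    """
    alpha = Fraction(alpha)
    if alpha <= 0:
        raise ValueError("alpha must be positive")
    # 1 + alpha == (1 + 1/alpha) * alpha, so c2 = 1 + 1/alpha is tight.
    return Fraction(1), 1 + 1 / alpha


for alpha in (Fraction(1, 2), Fraction(1), Fraction(4)):
    c1, c2 = theta_constants(alpha)
    # Both bounds of the theta definition hold for this alpha.
    assert c1 * alpha <= 1 + alpha <= c2 * alpha
    print(f"alpha={alpha}: c1={c1}, c2={c2}")
```

Note that the upper constant is 1+1/α, so it exists only because α is a fixed positive value; it is not defined when α is zero.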

In general, you can indeed simplify θ(1+α) to θ(α), except for one obscure case:

When the hash table stores no keys at all, then both n and m are zero. It is not stated in your document, but if this case needs to be covered, then n/m is undefined. If in that case α is defined to be zero, then we still have a problem, because then θ(α) = θ(0), which would mean there is no work at all. This is not true, because even when a hash table is empty, it costs time to determine that a key is not in it, i.e. O(1).

I think it might be because of this boundary case that the author preferred not to simplify θ(1+α) to just θ(α).
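To make the boundary case tangible, here is a minimal sketch of a search in a chained hash table that counts its own steps. It is not the CLRS pseudocode: the cost model (one step for hashing and reaching the slot, one step per chain element compared) and the names `chained_search_cost` and `insert` are assumptions for illustration. It also assumes the table always has at least one slot (m ≥ 1), even when it holds no keys.

```python
def chained_search_cost(table, key):
    """Return (found, steps) for a search in a chained hash table.

    `table` is a list of m >= 1 buckets, each bucket a list of keys.
    Assumed cost model: 1 step to hash and reach the slot, plus
    1 step per chain element that is compared.
    """
    m = len(table)
    steps = 1  # hashing and indexing the slot: the "1" in 1 + alpha
    for stored in table[hash(key) % m]:
        steps += 1  # one comparison per element in the chain
        if stored == key:
            return True, steps
    return False, steps


def insert(table, key):
    """Append key to the chain of its slot (no duplicate check)."""
    table[hash(key) % len(table)].append(key)


# Empty table: n = 0, m = 8, so alpha = 0, yet the search still costs 1 step.
empty = [[] for _ in range(8)]
print(chained_search_cost(empty, 42))  # (False, 1)

# n = 16 keys in m = 8 slots: alpha = 2, every chain has length 2.
full = [[] for _ in range(8)]
for k in range(16):
    insert(full, k)
print(chained_search_cost(full, 100))  # (False, 3), i.e. 1 + alpha
```

Under this cost model the unsuccessful search in the empty table takes one step, which θ(0) would not account for, while 1+α gives the right count in both cases.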

trincot
  • Hello, thank you a lot for your answer. I think I understand better now, but when you explain "When the hash stores no keys at all, then both n and m are zero.", in fact you don't even need m to be 0, just n, right? Because if n = 0 then α = 0, so if we only write θ(α) the problem appears too. By the way, since α = n/m, if we assume that n and m can't be 0, writing θ(1+α) is the same as θ(α), but is it the same as θ(n) in this case? Thank you in advance for your help – Ryan Carrier Aug 03 '22 at 17:08
  • Yes, it depends on how "slots" is defined. I took the point of view that an empty hash table has no slots, and so when n would be zero, it would follow that m would be zero too. If however the definition of slots allows m to be non-zero even when n is zero, then the argument for having `1+` is even more evident. As to the second question, again, that depends on the definition of slots. If the number of slots m is completely unrelated to n, then we cannot exclude it from the expression. – trincot Aug 03 '22 at 17:27