12

I have found many posts about the complexity of map and unordered_map. It is said that unordered_map has a worst-case complexity of O(N). For my purposes the input will be sorted values like 1 2 5 6 9 11 12... I need to insert, or find and delete, a value, and I will have to do inserts/deletes quite frequently. I thought of using set, which has O(log n) complexity in all cases. Then I stumbled upon unordered_map, which has a best case of O(1). But I need to understand: in my scenario, will I face the worst case of unordered_map? And what would that scenario be?

EDIT: In my case all the values will be unique.

Tahlil
  • 2,680
  • 6
  • 43
  • 84

2 Answers

11

unordered_map's worst case usually happens when the hash function produces collisions for every insertion into the map.

I said "usually" because the standard only specifies the worst case complexity, not when or how it will happen, so theoretically the answer to your question is that it is implementation defined.

Since all your values are unique, and apparently integers (which hash very well, possibly optimally; this is again implementation-dependent), you will not run into this worst-case scenario. insert/find/delete will be O(1) on average, so it looks like a reasonable choice.
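
Since the values are unique keys with no mapped data, a hash set is the closest fit. A minimal sketch (using std::unordered_set instead of unordered_map, which behaves the same with respect to hashing and complexity):

```cpp
#include <iostream>
#include <unordered_set>

int main() {
    std::unordered_set<int> values;

    // Insert the sorted, unique values: average O(1) per insertion.
    for (int v : {1, 2, 5, 6, 9, 11, 12})
        values.insert(v);

    // Find: average O(1).
    if (values.find(9) != values.end())
        std::cout << "found 9\n";

    // Erase by key: average O(1).
    values.erase(5);

    std::cout << "size after erase: " << values.size() << '\n';
}
```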

quantdev
  • 23,517
  • 5
  • 55
  • 88
    "when the hash function is producing collisions" - that makes it sound like the hash function in isolation... to be pedantic, it's when the hash function *mapped onto the buckets* (e.g. perhaps by `% bucket_count()`, though I don't believe that's mandated) has collisions. For example, if the hash function produces distinct values that are multiples of `bucket_count()` apart, they may collide. – Tony Delroy Jun 11 '14 at 06:23
  • Well yes, this is what I meant; how would you formulate it? – quantdev Jun 11 '14 at 06:26
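
To illustrate the point in the comments above, here is a hypothetical sketch that inserts keys spaced exactly bucket_count() apart. On implementations where std::hash<int> is the identity and the bucket is chosen as hash % bucket_count() (e.g. libstdc++), every key lands in the same bucket and operations on it degrade to a linear scan; other implementations may behave differently.

```cpp
#include <iostream>
#include <unordered_set>

int main() {
    std::unordered_set<int> s;
    s.reserve(200);                     // avoid rehashing during the test
    const auto b = s.bucket_count();

    // Insert keys spaced exactly bucket_count() apart.
    for (int i = 0; i < 100; ++i)
        s.insert(static_cast<int>(i * b));

    // With an identity hash and modulo bucket selection, all 100 keys
    // end up in bucket 0; the exact result is implementation-dependent.
    std::cout << "bucket_count: " << s.bucket_count()
              << ", elements in bucket 0: " << s.bucket_size(0) << '\n';
}
```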
1

Depending on the implementation of the hashing algorithm, an ordered set of data might end up causing a lot of collisions when used with an unordered_map. Since your data is ordered, it might be more advantageous to use a tree-based set, i.e. std::set (assuming you don't want the ability to add duplicate data).
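
A minimal sketch of that alternative, assuming unique int values: std::set (a tree-based set) guarantees O(log n) insert/find/erase in the worst case, regardless of how the input is distributed or hashed.

```cpp
#include <iostream>
#include <set>

int main() {
    // Ordered, unique values; the balanced tree keeps them sorted.
    std::set<int> values{1, 2, 5, 6, 9, 11, 12};

    values.insert(3);                  // O(log n) worst case
    bool found = values.count(9) > 0;  // O(log n) worst case
    values.erase(6);                   // O(log n) worst case

    std::cout << "found 9: " << found
              << ", size: " << values.size() << '\n';
}
```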

ByteByter
  • 86
  • 8
  • Which operation are you referring to that has a worst case of O(n) for a balanced tree? Not insert, delete or lookup; those are all O(log n) worst case. – Benjamin Lindley Jun 11 '14 at 05:23
  • A well-balanced tree will afford you an average-case complexity of O(log n) for deletion, insertion, and lookup; the worst-case scenario is still O(n). However, the majority of the time operations will be O(log n). @Benjamin Lindley Depending on how the tree set is implemented, it could have O(n) time on occasion (for example, a plain BST). http://bigocheatsheet.com/ – ByteByter Jun 11 '14 at 05:26
  • @ByteByter A balanced tree implementation, such as RB tree or AVL tree, is guaranteed to be O(log n) worst case. Your link says so. – n. m. could be an AI Jun 11 '14 at 05:45
  • @Benjamin Lindley Agreed, I was just saying that depending on how the data structure is coded it could potentially be worse. – ByteByter Jun 11 '14 at 18:41
  • @ByteByter: Only if it's coded in such a way that it no longer satisfies the definition of the data structure that it claims to be. But in that case, it could be much worse than O(n). It could be O(n^2) or O(2^n). But we wouldn't call it a balanced binary tree then, because it doesn't satisfy the performance requirements of one. Note that the bst you refer to, the one in your link I assume, that is a non-balanced tree. – Benjamin Lindley Jun 11 '14 at 20:38
  • @Benjamin Lindley That is true, I suppose I shouldn't have used the word balanced so liberally. – ByteByter Jun 12 '14 at 00:22