
I'm using std::map to store a set of numbers in sorted order. I would like to obtain the count of numbers less than or equal to some target number in O(1) time (not including the time it takes to find the maximum key that is <= target). Is there a way to do this?

As an example, say I have the following numbers (duplicates CAN exist) (1,0,5,2,3,8) stored as keys in std::map, and I have a target_key = 4. I want to use std::map to give me the solution 4 since there are 4 numbers in the array <=4.

I'm not sure if std::map is set up to do this. The only thing I can think of is using std::map::upper_bound to get the iterator just past the greatest key <= 4, then looping from begin() up to that iterator and summing the count of each key, but that's O(N) time.

EDIT 1:

Note that there may be duplicate numbers.

EDIT 2:

I misstated that the search needs to be done in O(1) time. I understand that this is not possible with any sorted container, and that the best case is O(log N). When I originally said O(1), I was envisioning already having found the maximum key that is <= target, and then using this key to find the count in O(1) time. I was also originally envisioning using std::map to store key-value pairs, where the value is a count of the occurrences of the key or, better yet, the count of keys (duplicates included) that are <= target.

24n8
  • Are you sure you don't want `std::set`? – François Andrieux Mar 05 '20 at 18:48
  • @FrançoisAndrieux Yes that wouldn't work I think because there can be duplicates. Sorry I forgot to mention in the OP. I'll edit it now. – 24n8 Mar 05 '20 at 18:49
  • `std::map` can't store duplicates. It sounds like you may want `std::multiset` then. – François Andrieux Mar 05 '20 at 18:50
  • Well... it's not like a `std::map` would have unique keys or something like that. Just saying, maybe you want a `std::vector` to just store a sequence of numbers? Then you can search for the number you want with binary search. – Lukas-T Mar 05 '20 at 18:51
  • I don't see how it's possible to have O(1) time for an arbitrary set of numbers. You're going to have to guess until you find the number you're interested in and you can't guess your way to an arbitrary location in constant time. I'll wager a guess that if you want this information in constant time, you have to be proactive in building something for it alongside your numbers. – chris Mar 05 '20 at 18:51
  • @FrançoisAndrieux Wow, I didn't know `std::multiset` and `std::unordered_multiset` existed! – 24n8 Mar 05 '20 at 18:51
  • The associative containers don't have random access iterators so determining how many elements come before a given iterator is always linear complexity. They also don't have a function dedicated to your task. So it doesn't seem like you can do this with the time complexity you are looking for. I would look into using a different container. A lot of people are suggesting a sorted vector which makes sense to me. – François Andrieux Mar 05 '20 at 18:54
  • I would suggest maintaining a sorted vector. std::lower_bound gives you the insert position, which also happens to be the count of smaller elements. Basically NO's deleted answer, but with a container that supports O(1) distance. This solution also works with duplicate values. I was debating whether to write it up or wait to see if NO posts a correction. – Kenny Ostrom Mar 05 '20 at 19:24
  • @KennyOstrom I'm not sure how to see a deleted answer? With a sorted vector, wouldn't insertion be O(N) since we'd have to displace all the elements after the insertion point, invalidating those iterators? – 24n8 Mar 05 '20 at 19:28
  • Yes. You'll have to decide on a data structure based on whether you have a lot of changes to the data structure, or if you have many more queries than actual changes. – Kenny Ostrom Mar 05 '20 at 19:46
  • @Iamanon You need 10k reputation to see deleted answers. – walnut Mar 05 '20 at 20:45
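
The sorted-vector suggestion from the comments can be sketched roughly as follows. The helper names `count_less_equal` and `insert_sorted` are made up for illustration, and the sketch uses std::upper_bound rather than std::lower_bound so that elements equal to the target are included in the count, as the question asks.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Hypothetical helper: number of elements <= target in an already sorted vector.
// std::upper_bound is a binary search here (random-access iterators), so this is
// O(log N), and the iterator subtraction is O(1).
std::size_t count_less_equal(const std::vector<int>& sorted, int target) {
    auto it = std::upper_bound(sorted.begin(), sorted.end(), target);
    return static_cast<std::size_t>(it - sorted.begin());
}

// Hypothetical helper: insert a value while keeping the vector sorted.
// Finding the position is O(log N), but shifting later elements makes the
// insertion itself O(N), which is the trade-off raised in the comments.
void insert_sorted(std::vector<int>& sorted, int value) {
    sorted.insert(std::upper_bound(sorted.begin(), sorted.end(), value), value);
}
```

Whether this is a good fit depends on the mix of operations: it favors workloads with many queries and few insertions.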

2 Answers


This is called a rank query, and std::map does not provide any way of doing it in better than linear time.

It's true that you can find the iterator in logarithmic time but there is no way to get the distance between that iterator and begin() in better than linear time. In order for std::map to provide such functionality, it would have to keep a count of subtree size in each node of the red-black tree, which would slow down each update operation, including for users who don't care about doing rank queries.

If you do augment the tree with subtree sizes, you can do the rank query in logarithmic time. Boost.Intrusive can help (I imagine there are people who have already written the necessary libraries to do rank queries, but I can't find them on Google right now).
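
As a rough illustration of the subtree-size idea, here is a minimal sketch. It is not how std::map is implemented and it does not use Boost.Intrusive: it swaps the red-black tree for a treap (a randomly balanced binary search tree), so the bounds are expected O(log N) rather than worst-case. The class name `RankMultiset` and its members are hypothetical.

```cpp
#include <cstddef>
#include <memory>
#include <random>
#include <utility>

// Hypothetical order-statistic multiset: a treap whose nodes cache subtree sizes.
class RankMultiset {
    struct Node {
        int key;
        unsigned priority;               // random heap priority that keeps the tree balanced
        std::size_t size = 1;            // number of nodes in the subtree rooted here
        std::unique_ptr<Node> left, right;
        Node(int k, unsigned p) : key(k), priority(p) {}
    };
    using Ptr = std::unique_ptr<Node>;

    Ptr root_;
    std::mt19937 rng_{12345};            // fixed seed, chosen arbitrarily for the sketch

    static std::size_t size_of(const Ptr& n) { return n ? n->size : 0; }
    static void update(Node& n) { n.size = 1 + size_of(n.left) + size_of(n.right); }

    // Split t into lo (keys <= key) and hi (keys > key).
    static void split(Ptr t, int key, Ptr& lo, Ptr& hi) {
        if (!t) { lo = nullptr; hi = nullptr; return; }
        Ptr rest;
        if (t->key <= key) {
            split(std::move(t->right), key, rest, hi);
            t->right = std::move(rest);
            update(*t);
            lo = std::move(t);
        } else {
            split(std::move(t->left), key, lo, rest);
            t->left = std::move(rest);
            update(*t);
            hi = std::move(t);
        }
    }

    // Merge two treaps where every key in a is <= every key in b.
    static Ptr merge(Ptr a, Ptr b) {
        if (!a) return b;
        if (!b) return a;
        if (a->priority > b->priority) {
            a->right = merge(std::move(a->right), std::move(b));
            update(*a);
            return a;
        }
        b->left = merge(std::move(a), std::move(b->left));
        update(*b);
        return b;
    }

public:
    // Insert a key; duplicates are stored as separate nodes.
    void insert(int key) {
        Ptr lo, hi;
        split(std::move(root_), key, lo, hi);
        auto node = std::make_unique<Node>(key, static_cast<unsigned>(rng_()));
        root_ = merge(merge(std::move(lo), std::move(node)), std::move(hi));
    }

    // Rank query: number of stored keys <= target, following a single
    // root-to-leaf path and adding cached left-subtree sizes along the way.
    std::size_t count_less_equal(int target) const {
        std::size_t count = 0;
        const Node* n = root_.get();
        while (n) {
            if (n->key <= target) {
                count += size_of(n->left) + 1;   // left subtree and n itself qualify
                n = n->right.get();
            } else {
                n = n->left.get();
            }
        }
        return count;
    }
};
```

With the numbers from the question (1, 0, 5, 2, 3, 8) inserted, `count_less_equal(4)` returns 4. The extra `size` field and the `update` calls on every structural change are exactly the per-update overhead described above.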

Brian Bi

I think there is no way in std::map/std::multimap to get the distance between two iterators in constant time.
std::map/std::multimap provide LegacyBidirectionalIterator, and using std::distance to find the distance between two such iterators has linear complexity.
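
For reference, the linear walk looks roughly like this when the map stores each number as a key and its number of occurrences as the value, as the question describes. The function name `count_less_equal` and the `std::size_t` value type are assumptions made for this sketch.

```cpp
#include <cstddef>
#include <map>

// Hypothetical helper: total occurrences of all keys <= target,
// where counts maps each number to how many times it was inserted.
std::size_t count_less_equal(const std::map<int, std::size_t>& counts, int target) {
    // The member upper_bound finds the first key > target in O(log N).
    const auto end = counts.upper_bound(target);

    // Bidirectional iterators can only be stepped one at a time, so summing
    // (like std::distance) is linear in the number of keys visited.
    std::size_t total = 0;
    for (auto it = counts.begin(); it != end; ++it) {
        total += it->second;
    }
    return total;
}
```

The lookup is logarithmic, but the loop dominates, so the whole query is O(N) in the worst case.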

Vikas Awadhiya