I have a class managed by std::shared_ptr. The class has a hash and all the comparison operators (==, <, etc.). For simplicity, let's say that class is int. What I want is a registry of all ints currently in use that won't keep them alive longer than necessary and that has fast (faster than linear) lookup. It will be used to ensure I don't create two different int objects with the same value: before I create a new 42, I'll check whether a 42 already exists in the registry. It seems that I want something like Java's WeakHashMap?

One possible solution is to use a std::unordered_set<std::shared_ptr<int>>, and periodically iterate through the set and delete any elements that have a shared_ptr::use_count() of 1. This is workable but not ideal since the objects stay alive longer than they need to.
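
A minimal sketch of that first option, assuming the set hashes and compares by pointed-to value rather than by address. The names `DerefHash`, `DerefEqual`, `StrongRegistry`, `intern_strong`, and `purge` are made up for illustration, and `use_count()` is only a reliable signal here because the sketch is single-threaded:

```cpp
#include <cstddef>
#include <functional>
#include <memory>
#include <unordered_set>

// Hash and compare shared_ptr<int> by the pointed-to value, not the address.
struct DerefHash {
    std::size_t operator()(const std::shared_ptr<int>& p) const {
        return std::hash<int>{}(*p);
    }
};
struct DerefEqual {
    bool operator()(const std::shared_ptr<int>& a,
                    const std::shared_ptr<int>& b) const {
        return *a == *b;
    }
};

using StrongRegistry =
    std::unordered_set<std::shared_ptr<int>, DerefHash, DerefEqual>;

// Return the registered object equal to `value`, registering a new one if
// none exists yet. On a duplicate, insert() hands back the existing element.
std::shared_ptr<int> intern_strong(StrongRegistry& reg, int value) {
    auto candidate = std::make_shared<int>(value);
    return *reg.insert(candidate).first;
}

// Erase every element that only the registry itself still references.
// Returns how many elements were dropped.
std::size_t purge(StrongRegistry& reg) {
    std::size_t removed = 0;
    for (auto it = reg.begin(); it != reg.end();) {
        if (it->use_count() == 1) {
            it = reg.erase(it);
            ++removed;
        } else {
            ++it;
        }
    }
    return removed;
}
```

Until `purge` runs, an unused object stays alive inside the set, which is exactly the drawback described above.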

Or I could use a std::vector<std::weak_ptr<int>>, which would let objects be freed immediately, but would require iterating through the whole vector every time I want to see if a specific int already exists (slow).
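
For comparison, the vector option could look roughly like this. The helper name `find_live` is hypothetical; the point is only that every lookup has to lock and compare each entry in turn:

```cpp
#include <memory>
#include <vector>

// Linear search: lock each weak_ptr and compare the live value.
// Returns an owning pointer to the match, or nullptr if none is alive.
std::shared_ptr<int> find_live(const std::vector<std::weak_ptr<int>>& reg,
                               int value) {
    for (const auto& w : reg) {
        if (auto sp = w.lock()) {   // skips entries that have expired
            if (*sp == value) return sp;
        }
    }
    return nullptr;
}
```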

A std::unordered_set<std::weak_ptr<int>> would satisfy both my requirements, except for the fact that the hashes would have to change once the weak_ptr expires, and "Container elements may not be modified since modification could change an element's hash and corrupt the container".
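
To make the problem concrete, here is a sketch of the kind of hash functor such a set would need. `WeakValueHash` is a hypothetical name, and returning 0 for an expired pointer is just one arbitrary choice; any fallback has the same flaw, because the original value is no longer reachable:

```cpp
#include <cstddef>
#include <functional>
#include <memory>

// Hashes a weak_ptr<int> by the value it points to while that value is alive.
// Once the pointer has expired there is nothing left to hash, so the result
// changes, which violates the container's stable-hash requirement.
struct WeakValueHash {
    std::size_t operator()(const std::weak_ptr<int>& w) const {
        if (auto sp = w.lock()) {
            return std::hash<int>{}(*sp);
        }
        return 0;  // expired: the original value cannot be recovered
    }
};
```

The same `weak_ptr` hashes to `std::hash<int>{}(42)` while its target is alive and to 0 afterwards, so a rehash after expiry could place the element in a different bucket than the one it was inserted into.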

Is there a container with fast lookup that can handle changing hashes? Keep in mind it doesn't need to find the elements that changed, it just needs to not be corrupted by them.

imMAW
  • Maybe something like LRU or TLRU cache approach is what you need. Usually implemented with a hashtable and a doubly-linked list. See https://einarwh.wordpress.com/2011/04/13/a-simple-lru-cache/ – Sergey Dyshko Jul 25 '20 at 05:34
  • A `std::map`; `std::map<int, std::weak_ptr<int>>`. – Manuel Jul 25 '20 at 05:37
  • @Manuel perhaps my 'int' analogy breaks down here a bit, but this object is large and expensive. `std::map<int, std::weak_ptr<int>>` requires keeping a copy of `int` around as the key, which won't be freed when all the references go away. @SergeyDyshko, reading through that now. – imMAW Jul 25 '20 at 05:46
  • @imMAW can't you handle the key? It doesn't need to be the whole object, just a subset. How do you compare them? – Manuel Jul 25 '20 at 05:58
  • @Manuel Exact comparison does require the whole object, but the hash could be used as the key instead. I think in line with your thoughts, `std::multimap<std::size_t, std::weak_ptr<int>>` might do the trick (multimap in case of hash collisions). – imMAW Jul 25 '20 at 06:06
  • Why do you need to change the hash once it expires? Can't you simply check if it exists and is not expired as your check? – super Jul 25 '20 at 07:00
  • @super Assuming you're talking about `std::unordered_set<std::weak_ptr<int>>`, it's not that I want the hash to change, it's that I won't be able to compute the old hash. I need to supply the `unordered_set` with a hash function that takes a `std::weak_ptr<int>`. I could implement this by doing: `if unexpired, return the hash of the int. else if expired, return 0`. Once the `weak_ptr` has expired, the hash function can't get the int, and won't be capable of computing the old hash. But it seems `unordered_set` will run into issues if I use that hash function. – imMAW Jul 25 '20 at 07:24
  • @imMAW You only need to compute the hash on insertion. Then it won't change. What do you mean you won't be able to compute the old hash? Additionally, if you want to check whether `42` exists in the map before creating it, in your analogy `42` is the object. So how do you know if the object exists already before creating it? – super Jul 25 '20 at 08:13
  • @super I am planning on creating a duplicate `42` for the purpose of lookup and then immediately destroying it if one already exists. Not perfect, but at any moment there will be at most one unnecessary object taking up memory. As for the changing hash - the hash of an object in the `set` can be recomputed at any time the `set` wishes to do so, and it expects the hash to be the same every time it does so. See the bottom-most answer here for clarification: "Hash calculation must be stable" https://stackoverflow.com/questions/50402508/why-does-the-32769th-insert-fail-in-stdunordered-set – imMAW Jul 25 '20 at 08:31
  • With `std::unordered_set<std::weak_ptr<int>>` the 'object in the `set`' is a `std::weak_ptr<int>`, so the `unordered_set` will expect the `std::weak_ptr<int>` to hash to exactly the same thing every time it gets hashed. On insert my hash function will be able to hash the `weak_ptr to 42` "correctly", but if it wants to rehash after it becomes a `weak_ptr to nothing`, I won't be able to return the "correct" hash. – imMAW Jul 25 '20 at 08:35
  • I think going with `std::multimap<std::size_t, std::weak_ptr<int>>` will solve all the problems. Essentially it caches the hash, so whenever it wants to recompute the hash it doesn't need to go through the weak_ptr. It just means I need to manually call hash during insert/lookup, rather than letting the container call hash for me. – imMAW Jul 25 '20 at 08:49
  • *This will be used to ensure I don't create two different int objects of the same number, e.g. before I create a new 42 I'll check if a 42 already exists in the registry* So this part of your question is just wrong then I guess? Either way storing the hash and object separately in a map seems like a valid approach, but then you have to make sure you deal with hash collisions properly. – super Jul 25 '20 at 16:26
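
The hash-keyed multimap idea from the comments could be sketched as follows. This is an illustration, not a confirmed solution: the key type `std::size_t`, the alias `WeakRegistry`, and the function `intern_weak` are assumptions, and the caller builds a throwaway candidate object for the lookup, as described in the thread. Collisions are handled by comparing the live values within the equal-hash range, and expired entries in that range are erased along the way:

```cpp
#include <cstddef>
#include <functional>
#include <map>
#include <memory>

// Key: the hash cached at insertion time. Value: a non-owning reference.
using WeakRegistry = std::multimap<std::size_t, std::weak_ptr<int>>;

// Return the live registered object equal to *candidate if there is one;
// otherwise register candidate and return it.
std::shared_ptr<int> intern_weak(WeakRegistry& reg,
                                 std::shared_ptr<int> candidate) {
    const std::size_t h = std::hash<int>{}(*candidate);
    auto range = reg.equal_range(h);
    for (auto it = range.first; it != range.second;) {
        if (auto live = it->second.lock()) {
            if (*live == *candidate) return live;  // true match, not just a collision
            ++it;
        } else {
            it = reg.erase(it);  // expired entry: clean it up while we are here
        }
    }
    reg.emplace(h, candidate);  // stored as weak_ptr, so it is not kept alive
    return candidate;
}
```

Because the key is stored rather than recomputed, an expired `weak_ptr` can never disturb the container's ordering. Expired entries under other hashes linger until a lookup touches them, so an occasional full sweep may still be worthwhile. Lookup is logarithmic with `std::multimap`; `std::unordered_multimap` with the same key should give average constant time.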

0 Answers