Insertion Affects on Thread 2's Iterators/References
For std::map
and other ordered associative containers, insertion won't invalidate container iterators (or element references), so it
(and it->first
and it-second
) in thread 2 shouldn't be affected by thread 1's my_map.insert
. (Note: this is entirely due to the standard behavior for containers, not the locks. Though the locks ensure my_map.insert(...)
in thread 1 and my_map.begin()
and my_map.erase(...)
won't overlap, they don't affect the interaction of the insertion in thread 1 and accessing through the iterator in thread 2.) It's a different story for the unordered containers, such as std::unordered_set
and std::unordered_map
, which will invalidate iterators if they must rehash.
Other Operation Affects on Thread 2's Iterators/References
Since thread 1's insertion won't affect it->first
or it->second
, making local copies isn't necessary. With the addition of a synchronized my_map::extract
method (calling map::extract
, added by C++17), the extraction can be made atomic (as long as the element doesn't need to remain in the map while the work is done):
my_map::node_type my_map::extract(iterator position) {
lock_guard<decltype(my_map::mutex)> lock(mutex);
return map::extract(position);
}
void thread2() {
while(! my_map.empty()){
auto element = my_map.extract(my_map.begin())
work(element.key(), element.value());
}
}
If other threads were to remove elements from the map, however, it then would be necessary to ensure the data in thread 2 wasn't affected, as thread 2 could be interrupted between the calls to my_map::begin
and my_map::extract
, and an erasure might invalidate the passed iterator. That should be the only critical point in thread 2, as finding & removing the element from the map both early and atomically should ensure other threads won't be able to erase the node themselves (though thread 2 could affect iterators and references in other threads, which would need to be addressed in their own critical sections).
Protecting Against Erasure
A copy or extraction would need to be protected by e.g. locks to prevent an erasure from interrupting the operation; my_map
has just such a lock. Looking at it from another angle, removing and returning the first element is entirely an operation on my_map
, and is thus a candidate to be a method.
In other languages, this operation is called pop
or shift
. There are some similarly named STL containers methods (pop_back
and pop
), though they remove an element but don't return it, and map
doesn't have such a method. Let's call the method shift
:
my_map::node_type my_map::shift() {
lock_guard<decltype(my_map::mutex)> lock(mutex);
return map::extract(map::begin());
}
void thread2() {
while(! my_map.empty()){
auto element = my_map.shift()
work(element.key(), element.value());
}
}
Note there are some caveats about the elements returned by extract
:
Pointers and references to the extracted element remain valid, but cannot be used while element is owned by a node handle [node_type
]: they become usable if the element is inserted into a container.
See "std::unordered_map::extract references/pointers invalidation" for details, but the gist is you can access the data via the node_type::key()
and node_type::value()
methods (including assigning to them, which is why there's a caveat). The issue in general would be if other threads had a reference to the same element and it were modified via the node handle (which thread 2 doesn't do, but steps would still need to be taken for safety), it could cause issues in those other threads.
shift
above is a bit leaky in its return value. This could be addressed by returning a value type instead:
my_map::value_type my_map::shift() {
lock_guard<decltype(my_map::mutex)> lock(mutex);
auto node = map::extract(map::begin());
return value_type(std::move(node.key()), std::move(node.value()));
}
thread2() {
while(! my_map.empty()){
auto element = my_map.shift()
work(element.first, element.second);
}
}
Since node_type::key()
and node_type::value()
return lvalues, std::move
in my_map::shift
is used to convert these to rvalues, so that the returned pair grabs the data from the node, rather than making copies.
Alternatively, a lock could protect the call to work
as a critical section, though this would be less performant if work(...)
takes an appreciable amount of time. In general, copying/extracting reduces the critical section of code, so other threads don't need to wait for the real work to complete.
Erasure Affects on Thread 1
As thread 1 doesn't operate on elements of the map (only the map itself), there's nothing for thread 2's my_map.erase(...)
to affect in that regard. Even if thread 1 did operate on an iterator or reference to an item from my_map
, the erase would only invalidate erased elements (thread 1 would need to protect against this, as described above).