
I have a map with objects that need to be released before the map is cleared. I am tempted to iterate over the map and remove/release objects as I walk through it.

Here is a mock-up example: https://play.golang.org/p/kAtPoUgMsq

Since the only way to iterate the map is through range, how would I synchronize multiple producers and multiple consumers?

I don't want to read lock the map since that would make delete/modifying keys during the iteration impossible.

JeffJen
  • We need more of the context to work with this. What does the map contain? Is it huge? How fast is releasing a thing? (What _is_ releasing?) Is it a disaster if a released object is used by another goroutine, and if so, what determines how long an object picked out of the map remains in use? What is the application and what are its priorities (e.g. maximizing throughput vs. limiting latency)? – twotwotwo Nov 06 '16 at 00:03

3 Answers


There are a bunch of ways you can clean up things from a map without racy map accesses. What works for your application depends a lot on what it's doing.

0) Just lock the map while you work on it. If the map's not too big, or you have some latency tolerance, it gets the job done quickly (in terms of time you spend on it) and you can move on to thinking about other stuff. If it becomes a problem later, you can come back to the problem then.
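A minimal sketch of option 0, with a hypothetical `Resource` type and `Release` method standing in for whatever the map actually holds:

```go
package main

import (
	"fmt"
	"sync"
)

// Resource stands in for whatever the map holds; Release is hypothetical.
type Resource struct{ name string }

func (r *Resource) Release() { /* free the underlying thing */ }

var (
	mu sync.Mutex
	m  = map[string]*Resource{"a": {"a"}, "b": {"b"}}
)

// clearAll holds the lock for the entire cleanup: simple and correct,
// at the cost of blocking all other readers/writers while releases run.
// (Deleting entries during a range over the same map is allowed in Go.)
func clearAll() {
	mu.Lock()
	defer mu.Unlock()
	for k, v := range m {
		v.Release()
		delete(m, k)
	}
}

func main() {
	clearAll()
	fmt.Println(len(m)) // 0
}
```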

1) Copy the objects or pointers out and clear the map while holding a lock, then release the objects in the background. If the problem is that the slowness of releasing itself will keep the lock held a long time, this is the simple workaround for that.
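A sketch of option 1, again with a hypothetical `Resource` type; the lock is held only for the snapshot-and-clear, not for the slow releases:

```go
package main

import (
	"fmt"
	"sync"
)

// Resource and Release are stand-ins for whatever the real app holds.
type Resource struct{ name string }

func (r *Resource) Release() { /* slow cleanup work */ }

var (
	mu sync.Mutex
	m  = map[string]*Resource{"a": {"a"}, "b": {"b"}}
	wg sync.WaitGroup // lets callers wait for background releases if needed
)

// clearAndReleaseAsync holds the lock only long enough to copy the values
// out and clear the map; the slow Release calls run in the background.
func clearAndReleaseAsync() {
	mu.Lock()
	victims := make([]*Resource, 0, len(m))
	for _, v := range m {
		victims = append(victims, v)
	}
	m = make(map[string]*Resource)
	mu.Unlock()

	wg.Add(1)
	go func() {
		defer wg.Done()
		for _, v := range victims {
			v.Release()
		}
	}()
}

func main() {
	clearAndReleaseAsync()
	wg.Wait()
	fmt.Println(len(m)) // 0
}
```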

2) If efficient reads are basically all that matters, use atomic.Value. That lets you entirely replace one map with a new and different one. If writes are essentially 0% of your workload, the efficient reads balance out the cost of creating a new map on every change. That's rare, but it happens, e.g., encoding/gob has a global map of types managed this way.
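A sketch of the copy-on-write pattern from option 2: readers call `Load` with no locks at all, and each write copies the whole map and atomically swaps it in (the mutex only serializes writers against each other):

```go
package main

import (
	"fmt"
	"sync"
	"sync/atomic"
)

var (
	current atomic.Value // holds a map[string]int
	writeMu sync.Mutex   // serializes writers only; readers never lock
)

func init() { current.Store(map[string]int{}) }

// load reads from the current snapshot without any locking.
func load(k string) (int, bool) {
	m := current.Load().(map[string]int)
	v, ok := m[k]
	return v, ok
}

// store copies the whole map, applies the change, and swaps it in.
// Fine when writes are vanishingly rare; very expensive otherwise.
func store(k string, v int) {
	writeMu.Lock()
	defer writeMu.Unlock()
	old := current.Load().(map[string]int)
	next := make(map[string]int, len(old)+1)
	for ok, ov := range old {
		next[ok] = ov
	}
	next[k] = v
	current.Store(next)
}

func main() {
	store("x", 1)
	v, ok := load("x")
	fmt.Println(v, ok) // 1 true
}
```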

3) If none of those do everything you need, tweak how you store the data (e.g. shard the map). Replace your map with 16 maps and hash keys yourself to decide which map a thing belongs in, and then you can lock one shard at a time, for cleanup or any other write.
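A sketch of option 3, sharding by a hash of the key (16 shards and FNV hashing are arbitrary choices here):

```go
package main

import (
	"fmt"
	"hash/fnv"
	"sync"
)

const shardCount = 16

type shard struct {
	sync.Mutex
	m map[string]int
}

type shardedMap [shardCount]*shard

func newShardedMap() *shardedMap {
	var s shardedMap
	for i := range s {
		s[i] = &shard{m: map[string]int{}}
	}
	return &s
}

// shardFor hashes the key to pick which shard owns it.
func (s *shardedMap) shardFor(k string) *shard {
	h := fnv.New32a()
	h.Write([]byte(k))
	return s[h.Sum32()%shardCount]
}

func (s *shardedMap) Store(k string, v int) {
	sh := s.shardFor(k)
	sh.Lock()
	sh.m[k] = v
	sh.Unlock()
}

// Clear locks one shard at a time, so the other 15 shards stay
// available to readers and writers throughout the cleanup.
func (s *shardedMap) Clear() {
	for _, sh := range s {
		sh.Lock()
		sh.m = map[string]int{}
		sh.Unlock()
	}
}

func main() {
	s := newShardedMap()
	s.Store("a", 1)
	s.Clear()
	fmt.Println("cleared")
}
```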

There's also the issue of a race between release and use: goroutine A gets something from the map, B clears the map and releases the thing, A uses the released thing.

One strategy there is to lock each value while you use or release it; then you need locks but not global ones.
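A sketch of per-value locking, with a hypothetical `Obj` type: whoever fetched the object from the map and whoever releases it both take the value's own mutex, so a use that loses the race fails safely instead of touching a released object:

```go
package main

import (
	"fmt"
	"sync"
)

// Obj is a hypothetical value type whose own mutex guards use vs. release.
type Obj struct {
	mu       sync.Mutex
	released bool
}

// Use returns false if the object was released after the caller fetched
// it from the map, so the caller can fail safely instead of racing.
func (o *Obj) Use() bool {
	o.mu.Lock()
	defer o.mu.Unlock()
	if o.released {
		return false
	}
	// ... do work with o here ...
	return true
}

func (o *Obj) Release() {
	o.mu.Lock()
	o.released = true
	o.mu.Unlock()
}

func main() {
	o := &Obj{}
	fmt.Println(o.Use()) // true
	o.Release()
	fmt.Println(o.Use()) // false
}
```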

Another is to tolerate the consequences of races if they're known and not disastrous; for example, concurrent access to net.Conns is explicitly allowed by its docs, so closing an in-use connection may cause a request on it to error but won't lead to undefined app behavior. You have to really be sure you know what you're getting into then, though, 'cause many benign-seeming races aren't.

Finally, maybe your application already is ensuring no in-use objects are released, e.g. there's a safely maintained reference count on objects and only unused objects are released. Then, of course, you don't have to worry.

It may be tempting to try to replace these locks with channels somehow but I don't see any gains from it. It's nice when you can design your app thinking mainly in terms of communication between processes rather than shared data, but when you do have shared data, there's no use in pretending otherwise. Excluding unsafe access to shared data is what locks are for.

twotwotwo

You do not state all the requirements (e.g. whether multiple objects may be released simultaneously), but the simplest solution I can think of is to remove the elements and launch a release goroutine for each removed element:

for k := range keysToRemove {
    if v, ok := m[k]; ok {
        delete(m, k)
        go release(k, v)
    }
}
kostya
  • Does the method given in this answer actually suffice? (Esp. after the 1.6 changes?) Is there a place in the language specification where this is cleared up / definitively answered? – BadZen Nov 05 '16 at 20:08
  • In the code sample above the map is accessed from a single goroutine, so no synchronisation is necessary as there is no concurrent access. – kostya Nov 06 '16 at 09:56
  • Of course OP is assuming there will be concurrent writers during the iteration. He says: "I don't want to read lock the map since that would make delete/modifying keys during the iteration impossible." Just because he doesn't /show/ the writers doesn't mean they're not involved. – BadZen Nov 06 '16 at 13:30
  • I am not quite sure what the question is. If you are asking if it is safe to read/write a map from multiple goroutines without synchronisation then the answer is no. However sometimes it is possible to re-write the code to make access single threaded. That is what I attempted to show in my answer by moving `delete(m, k)` out of a separate goroutine. It is hard to tell if this answer meets OP requirements though as there are not enough of them in the question. – kostya Nov 08 '16 at 08:05

Update August 2017 (golang 1.9)

Go 1.9 adds a new Map type to the sync package: a concurrent map with amortized-constant-time loads, stores, and deletes.
It is safe for multiple goroutines to call a Map's methods concurrently.


Original answer Nov. 2016

I don't want to read lock the map

That makes sense, since deleting from a map is considered a write operation and must be serialized with all other reads and writes. That implies taking a write lock to complete the delete. (source: this answer)

Assuming the worst-case scenario (multiple writers and readers), you can take a look at the implementation of orcaman/concurrent-map, whose Remove() method uses multiple sync.RWMutex instances: to avoid lock bottlenecks, the concurrent map is divided into several (SHARD_COUNT) shards.
This is faster than guarding the whole map with a single RWMutex.

VonC