2

I've come across an interesting issue using mutable dictionaries on background threads.

Currently, I am downloading data in chunks on one thread, adding it to a data set, and processing it on another background thread. The overall design works for the most part aside from one issue: On occasion, a function call to an inner dictionary within the main data set causes the following crash:

*** Collection <__NSDictionaryM: 0x13000a190> was mutated while being enumerated.

I know this is a fairly common crash to have, but the strange part is that it's not crashing in a loop on this collection. Instead, the exception breakpoint in Xcode is stopping on the following line:

NSArray *tempKeys = [temp allKeys];

This leads me to believe that one thread is adding items to this collection while the NSMutableDictionary's internal function call to -allKeys is enumerating over the keys in order to return the array on another thread.

My question is: Is this what's happening? If so, what would be the best way to avoid this?

Here's the gist of what I'm doing:

dispatch_async(dispatch_get_global_queue(DISPATCH_QUEUE_PRIORITY_DEFAULT, 0), ^(void) {

    for (NSString *key in [[queue allKeys] reverseObjectEnumerator]) { //To prevent crashes

        NEXActivityMap *temp = queue[key];
        NSArray *tempKeys = [temp allKeys]; //<= CRASHES HERE
        if (tempKeys.count > 0) {
            //Do other stuff
        }
    }
});
rebello95
  • 8,486
  • 5
  • 44
  • 65
  • `-allKeys` (`NSDictionary`) is a loop. There is a dispatch async. Do you change the dictionary in another thread? – Amin Negm-Awad Mar 08 '16 at 21:36
  • @AminNegm-Awad There is a call on another thread that adds entries to the map. Is there a way to either lock this element while it's being enumerated or find the keys another way? – rebello95 Mar 08 '16 at 21:38
  • You can use a `synchronized` block but you need to look carefully at your data access in a multithreaded environment. Generally it is better for your task that is loading new data and your task that is processing downloaded data to use separate data structures rather than a shared object. E.g.download data, pass `queue` to be processed, the next chunk of data is downloaded into a different queue object and then this is passed to the processing task and so on – Paulw11 Mar 08 '16 at 21:42
  • You have to put *all* accesses into the same (serial) dispatch queue. Likely you want to create a private queue for that. – Amin Negm-Awad Mar 08 '16 at 21:51
  • Thus far it looks like adding `@synchronize` around either/or of the statements (`-allKeys` or the function adding elements) does *not* solve the problem. I just added `@synchronize` blocks around *both* sections of code, and it's working thus far. Still testing... any reason this wouldn't work @AminNegm-Awad? – rebello95 Mar 08 '16 at 22:04

3 Answers3

3

You can use @synchronize. And it will work. But this is mixing up two different ideas:

  • Threads have been around for many years. A new thread opens a new control flow. Code in different threads are running potentially concurrently causing conflicts as you had. To prevent this conflicts you have to use locks like @synchronized do.

  • GCD is the more modern concept. GCD runs "on top of threads" that means, it uses threads, but this is transparent for you. You do not have to care about this. Code running in different queues are running potentially concurrently causing conflicts. To prevent this conflicts you have to use one queue for shared resources.

You are already using GCD, what is a good idea:

dispatch_async(dispatch_get_global_queue(DISPATCH_QUEUE_PRIORITY_DEFAULT, 0), ^(void) {

The same code with threads would look like this:

[[NSThread mainThread] performSelector:…];

So, using GCD, you should use GCD to prevent the conflicts. What you are doing is to use GCD wrongly and then "repair" that with locks.

Simply put all accesses to the shared resource (in your case the mutable dictionary referred by temp) into on serial queue.

  1. Create a queue at the beginning for the accesses. This is a one-timer.

You can use one of the existing queues as you do in your code, but you have to use a serial one! But this potentially leads to long queues with waiting tasks (in your example blocks). Different tasks in a serial queue are executed one after each other, even there are cpu cores idle. So it is no good idea to put too many tasks into one queue. Create a queue for any shared resource or "subsystem":

dispatch_queue_t tempQueue;
tempQueue = dispatch_queue_create("tempQueue", NULL);
  1. When code wants to access the mutable dictionary, put it in a queue:

It looks like this:

dispatch_sync( tempQueue, // or async, if it is possible
^{
  [tempQueue setObject:… forKey:…]; // Or what you want to do.
}

You have to put every code accessing the shared resource in the queue as you have to put every code accessing the shared resource inn locks when using threads.

Community
  • 1
  • 1
Amin Negm-Awad
  • 16,582
  • 3
  • 35
  • 50
1

From Apple documentation "Thread safety summary":

Mutable objects are generally not thread-safe. To use mutable objects in a threaded application, the application must synchronize access to them using locks. (For more information, see Atomic Operations). In general, the collection classes (for example, NSMutableArray, NSMutableDictionary) are not thread-safe when mutations are concerned. That is, if one or more threads are changing the same array, problems can occur. You must lock around spots where reads and writes occur to assure thread safety.

In your case, following scenario happens. From one thread, you add elements into dictionary. In another thread, you accessing allKeys method. While this methods copies all keys into array, other methods adds new key. This causes exception.

To avoid that, you have several options.

Because you are using dispatch queues, preferred way is to put all code, that access same mutable dictionary instance, into private serial dispatch queue.

Second option is passing immutable dictionary copy to other thread. In this case, no matter what happen in first thread with original dictionary, data still will be consistent. Note that you will probably need deep copy, cause you use dictionary/arrays hierarchy.

Alternatively you can wrap all points, where you access collections, with locks. Using @synchronized also implicitly create recursive lock for you.

Borys Verebskyi
  • 4,160
  • 6
  • 28
  • 42
  • Putting `@synchronize` blocks around both sections of code (read and write) appears to have resolved the problem. Thanks for the explanation! – rebello95 Mar 08 '16 at 23:01
  • `@synchronize` (aka locks) is for threads. You are using dispatch queues, which is the more modern approach. `@syncrhonize` works, cause it prevents concurrent accesses, if the code is running on different queues in different threads. But it is mixing up two technologies. *Queues are made to have no locks anymore.* But, of course, you have to use them always. (As you have to use locks always.) – Amin Negm-Awad Mar 09 '16 at 05:38
0

How about wrapping where you get the keys AND where you set the keys, with @synchronize?

Example:

- (void)myMethod:(id)anObj
{
    @synchronized(anObj)
    {
        // Everything between the braces is protected by the @synchronized directive.
    }
}
kvr
  • 573
  • 2
  • 8