
I have implemented what I think is double-checked locking in a class to achieve thread-safe lazy loading.

Just in case you wondered, this is for a DI library I'm currently working on.

The code I'm talking about is the following:

import Foundation

final class Builder<I> {

   private let body: () -> I

   private var instance: I?
   private let instanceLocker = NSLock() // guards the one-time creation of `instance`

   private var isSet = false
   private let isSetDispatchQueue = DispatchQueue(label: "\(Builder.self)", attributes: .concurrent) // concurrent reads, barrier writes of `isSet`

   init(body: @escaping () -> I) {
       self.body = body
   }

   private var syncIsSet: Bool {
       set {
          isSetDispatchQueue.async(flags: .barrier) {
             self.isSet = newValue
          }
       }
       get {
          var isSet = false
          isSetDispatchQueue.sync {
              isSet = self.isSet
          }
          return isSet
       }
   }

   var value: I {

       if syncIsSet {
           return instance! // should never fail
       }

       instanceLocker.lock()

       if syncIsSet {
           instanceLocker.unlock()
           return instance! // should never fail
       }

       let instance = body()
       self.instance = instance

       syncIsSet = true
       instanceLocker.unlock()

       return instance
    }
}

The logic is to allow concurrent reads of isSet so that accesses to instance can run in parallel from different threads. To avoid race conditions (that's the part I'm not 100% sure about), I have two barriers: one when setting isSet and one when setting instance. The trick is to unlock the latter only after isSet has been set to true, so the threads waiting for instanceLocker to be unlocked block a second time on isSet while it is being asynchronously written on the concurrent dispatch queue.
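
To make the access pattern concrete, here is a rough usage sketch; `ExpensiveService` and the use of `concurrentPerform` are just stand-ins for whatever threads end up resolving the same value:

// Hypothetical usage, for illustration only. `ExpensiveService` is a
// placeholder for whatever the builder creates lazily.
final class ExpensiveService {}

let builder = Builder { ExpensiveService() }

// Arbitrary threads resolving the same value concurrently; only the first
// access should actually run `body()`.
DispatchQueue.concurrentPerform(iterations: 100) { _ in
    _ = builder.value
}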

I think I'm very close to a final solution here, but since I'm not a distributed-systems expert, I'd like to make sure.

Also, using a dispatch queue wasn't my first choice, because I suspect it makes reading isSet less efficient than it could be, but again, I'm no expert.

So my two questions are:

  • Is this 100% thread-safe and if not, why?
  • Is there a more efficient way of doing this in Swift?
trupin
  • This is very over-complicated. A dispatch queue is absolutely the correct tool; NSLock is the wrong tool. I'm trying to work out what exactly the goal is, however. Is the goal here to make calls to `body()` serialized? Is `body()` not promised to be reentrant? Or is there another goal? Forget about `isSet`; what is `value` supposed to do here? (There is almost never a correct use of NSLock in Swift, ever. NSLock stopped being the right tool when GCD was released, long before Swift was created.) – Rob Napier Jun 10 '18 at 00:56
  • Double-check locking is also never the right tool in Swift. (It is very hard to get right in other languages, but there's just no reason to even try in Swift.) You're so close, too. Your code around `isSet` is exactly right (I just don't think you need `isSet`). – Rob Napier Jun 10 '18 at 00:59
  • The goal here is to have a barrier when setting the value for the first time, then avoid this barrier once the value has been instantiated. I don't want reads of `instance` to block the thread, knowing it's not necessary 99% of the time. If `NSLock` is wrong, I guess I could use a `DispatchSemaphore` instead. I can't dispatch `body()` on a queue because I want it to be called on the same thread as the caller. – trupin Jun 10 '18 at 01:06
  • Why is double-check locking never the right tool in Swift? I don't see any other options, since `lazy var` isn't thread-safe yet; there's a ticket for that (https://bugs.swift.org/browse/SR-1042) which is still open. And things like `dispatch_once` don't exist in Swift (I'm not sure I could have used it here anyway...) – trupin Jun 10 '18 at 01:10
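
For reference, the usual Swift replacement for `dispatch_once` mentioned in these comments is the guaranteed lazy, thread-safe initialization of globals and `static` properties; a minimal sketch (the `Service` type is made up):

final class Service {

    // The runtime guarantees this initializer runs at most once, even if
    // `shared` is first accessed from several threads simultaneously.
    static let shared = Service()

    private init() {}
}

That pattern doesn't directly apply here, though, since each Builder instance needs its own one-time initialization with caller-provided parameters rather than a single global one.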

1 Answer


IMO, the correct tool here is os_unfair_lock. The point of double-check locking is to avoid the expense of a full kernel lock, and os_unfair_lock provides that in the uncontested case. The "unfair" part is that it makes no promises to waiting threads: if one thread unlocks, it is allowed to relock without another waiting thread getting a chance (and thus a waiter could starve). In practice, with a very small critical section, this is not relevant (in this case you're just checking a local variable against nil). Dispatching to a queue is very fast, but not as fast as an unfair lock, since queues are themselves built on lower-level primitives like unfair_lock.

import os

final class Builder<I> {

    private let body: () -> I
    private var lock = os_unfair_lock()

    init(body: @escaping () -> I) {
        self.body = body
    }

    private var _value: I!
    var value: I {
        os_unfair_lock_lock(&lock)   // cheap in the uncontested case
        if _value == nil {
            _value = body()          // runs at most once
        }
        os_unfair_lock_unlock(&lock)

        return _value
    }
}

Note that you were correct to do synchronization on syncIsSet. If you'd treated it as a primitive (as is common in other double-check synchronizations), then you'd be relying on things that Swift doesn't promise (both the atomicity of writing Bools and that it would actually check the boolean twice, since there's no volatile). Given that you're doing synchronization, the comparison is between os_unfair_lock and dispatching to a queue.

This said, in my experience this kind of laziness is almost always unwarranted in mobile apps. It only actually saves you time if the variable is very expensive, but likely never accessed. Sometimes in massively parallel systems, being able to move the initialization is worthwhile, but mobile apps live on a fairly limited number of cores, so there generally isn't some extra core lying around to shunt this off to. I generally would not pursue this unless you've already discovered that it is a significant problem when your framework is used in live systems. If you have, then I recommend profiling your approach against os_unfair_lock in real-world usages that show this problem. I expect os_unfair_lock to win.

Rob Napier
  • Thanks! This is very helpful. I didn't know about `os_unfair_lock`. – trupin Jun 10 '18 at 06:37
  • I also replaced the `NSLock` with a `DispatchSemaphore` in my original solution, which is working well. Do you know how good a `DispatchSemaphore` is compared to an `NSLock`? From the documentation, it seems to be optimized, but it's not described in what way. – trupin Jun 10 '18 at 06:47
  • In any case, your solution works well and seems to be the most optimized. I don't have a real-world usage that shows this problem so far, but since I'm building a library, I'm trying to deal with the worst-case scenario. Since I don't know how it's going to get used, I also don't know how heavy `body()` will be, since it's provided by the library's user. – trupin Jun 10 '18 at 06:56
  • Understood, but for a library this is probably exactly the wrong direction, because laziness has a non-trivial cost in the standard case (where the value is immediately used). If you don't need laziness, adding it makes performance worse. Better is to leave that kind of laziness to the caller rather than in the library, until use cases accumulate that indicate you need to add an *optional* lazy solution that callers can use. – Rob Napier Jun 10 '18 at 13:37
  • DispatchSemaphore is probably pretty close to NSLock because they're both non-recursive fair locks (I would expect NSLock is built on DispatchSemaphore these days, but I haven't looked at the code recently). Both are generally slower than os_unfair_lock, which is a very low-level primitive (and so should only be used in cases where you are certain you need it; it's not a general-purpose solution). – Rob Napier Jun 10 '18 at 13:38
  • Thanks again for all these info! – trupin Jun 11 '18 at 17:46
  • You're probably right: making lazy loading the default behavior is the wrong direction if I want devs to have the opportunity to use my library in heavily multi-threaded programs. The underlying reason why I'm using lazy loading right now is that I'm building a dependency container with an API that allows registering dependency builders that take parameters, which means I can't build the dependency upfront since I don't have the parameters upfront. So I think what I'll do is provide an additional API to register instances directly, which will delegate the building part to the caller. – trupin Jun 11 '18 at 17:46
  • I didn't look at the code of `NSLock` or `DispatchSemaphore`, but, according to my understanding of semaphores, they are faster than locks because they internally employ condition variables which help the waiting threads get re-scheduled more quickly. – CopperCash Apr 10 '19 at 06:45
  • To ensure your `os_unfair_lock` and other C locks are allocated on the heap and never moved or copied, you should allocate and initialize (and later deinitialize and deallocate) them with an `UnsafeMutablePointer`. Otherwise, they may stop working unexpectedly, crash in rare cases, and leak kernel resources, according to Philippe Hausler of the GCD team: https://forums.swift.org/t/atomic-property-wrapper-for-standard-library/30468/18 – Alan Kantz Dec 29 '20 at 04:33
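
A minimal sketch of the allocation pattern that last comment describes; the `UnfairLock` wrapper and its `withLock` helper are made up for illustration, and only the `UnsafeMutablePointer` handling is the point:

import os

final class UnfairLock {

    private let lockPointer: UnsafeMutablePointer<os_unfair_lock>

    init() {
        // Allocate the C lock on the heap so it is never moved or copied.
        lockPointer = UnsafeMutablePointer<os_unfair_lock>.allocate(capacity: 1)
        lockPointer.initialize(to: os_unfair_lock())
    }

    deinit {
        lockPointer.deinitialize(count: 1)
        lockPointer.deallocate()
    }

    func withLock<T>(_ body: () throws -> T) rethrows -> T {
        os_unfair_lock_lock(lockPointer)
        defer { os_unfair_lock_unlock(lockPointer) }
        return try body()
    }
}

Wrapping the pointer in a class gives the lock a stable heap address for the lifetime of the object and ties cleanup to deinit.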