I have performance-sensitive code that works with frames of video playback in real time. Some of this work can be parallelised, and since it's a performance-sensitive code where latency is the key I decided to go with NSThread instead of GCD.

What I need: an NSThread that is handed some work at certain intervals. After the thread is done with its work, it goes to sleep until new work arrives.

Unfortunately, there isn't much information on the internet about correct techniques for using NSThread, so I assembled my routine from the bits of information I managed to find.

You can find the whole workflow below:

1) Initialise my NSThread. This code runs only once, as it's supposed to.

_myThread = [[NSThread alloc] initWithTarget:self selector:@selector(_backgroundMethod) object:nil];
_myThread.threadPriority = 0.8;    //max priority is 1.0. Let's try at 0.8 and see how it performs
[_myThread start];

2) _backgroundMethod code:

- (void)_backgroundMethod
{
    NSLog(@"Starting the thread...");
    [NSTimer scheduledTimerWithTimeInterval:FLT_MAX target:self selector:@selector(doNothing:) userInfo:nil repeats:YES];

    BOOL done = NO;
    NSRunLoop *runLoop = [NSRunLoop currentRunLoop];
    do {
        [runLoop runMode:NSDefaultRunLoopMode beforeDate:[NSDate distantFuture]];
    } while (!done);
}

- (void)doNothing:(NSTimer *)sender { }

3) When the thread has something to work on, I make the following call:

[self performSelector:@selector(_doSomeCalculation) onThread:_myThread withObject:nil waitUntilDone:NO];

which calls the following method:

- (void) _doSomeCalculation
{
    //do some work here
}

So my questions are:

1) When I init the NSThread, I pass a selector. What is the purpose of that selector? As far as I understand, its only purpose is to control the thread's run loop, and I should not do any calculations in it. So I'm attaching a timer with an effectively infinite interval to the NSRunLoop just to keep it alive without constantly spinning a while loop. Is that the right approach?

2) If I can do a calculation in the selector I pass when initialising the NSThread, how can I signal the NSRunLoop to run one iteration without using performSelector? I think I should not pass the exact same method to performSelector because it would be a mess, right?

I've read a lot of the material Apple provides, but it is almost entirely theoretical, and the code samples it does include confused me even more.

Any clarification would be much appreciated. Thanks in advance!

EDIT:

One more question: how can I calculate the desired stackSize for my thread? Is there a technique for that?

Eugene Alexeev
  • Bad decision. Use `GCD`. It was designed to "kill" `NSThread`. It has everything you need. You are just reinventing GCD. You can use `GCD` directly or its Objective-C wrapper, `NSOperationQueue` – Cy-4AH Sep 13 '19 at 10:48
  • Hello! Here's what Apple says about `NSThread` in the article `Migrating Away from Threads` - `It is important to remember that queues are not a panacea for replacing threads. The asynchronous programming model offered by queues is appropriate in situations where latency is not an issue. Even though queues offer ways to configure the execution priority of tasks in the queue, higher execution priorities do not guarantee the execution of tasks at specific times. Therefore, threads are still a more appropriate choice in cases where you need minimal latency, such as in audio and video playback.` – Eugene Alexeev Sep 13 '19 at 10:51
  • So I want to point out that I decided to try `NSThread` because I had certain problems with `GCD` in that particular case, like latency getting worse over time and uneven performance on different devices. I need more control over how my task is performed, and I want to learn how to do it properly – Eugene Alexeev Sep 13 '19 at 10:52
  • I think Apple means that you perform plain sequential work in the thread: take data from input, process it, put it in output. Not a scheduling mechanism where you create run loops and perform selectors. Performing selectors also takes more time than a direct function call; you should think about that if latency matters so much. – Cy-4AH Sep 13 '19 at 11:04
  • Take data from input, process it and put it to output - that is literally what I'm doing. The scheduling mechanism is there only to keep the thread and its run loop alive and asleep until I need to use it again. – Eugene Alexeev Sep 13 '19 at 11:11
  • I will have to go with @Cy-4AH on this one. Either use GCD to do all your work on a single queue or use `NSThread` without a run loop for each of your processes. It seems you have more than one. In both cases you WILL get latency when the work you need to do in these thread(s) is too much to handle. Either the number of open threads will increase, or the previous task will not yet be finished when the new one is scheduled, thus producing latency. The latter is basically what GCD should do. – Matic Oblak Sep 13 '19 at 11:55

1 Answer

since it's a performance-sensitive code where latency is the key I decided to go with NSThread instead of GCD.

You should generally not do that unless you have a solid understanding of GCD and know exactly what you're giving up. Used correctly, GCD is highly optimized, and integrated very closely with the OS. It's particularly surprising to be using NSThread by hand, but then doing your work in ObjC with performSelector and run loops. Calling performSelector this way is introducing the same kind of unknown latency as a GCD serial queue. If the thread is already busy, then you'll queue the selector, exactly like queuing a block (but you'll add the overhead of objc_msgSend). A GCD concurrent queue would perform better. In order to match GCD, you would need to implement a proper thread pool (or at least add cancelation). Done well, this can be better than GCD for specific use cases, but it has to be done very well, and that's complicated.
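To make the GCD alternative concrete, here is a rough, untested sketch of fanning one frame's work out over a concurrent queue. The function names, the queue label, and the idea of splitting a frame into independent "slices" are all assumptions for illustration, and the user-interactive QoS is only appropriate if the work really does feed the UI in real time.

```objc
#import <Foundation/Foundation.h>

// Hypothetical helper: creates a concurrent queue whose work items run at
// user-interactive QoS. The label is an arbitrary placeholder.
static dispatch_queue_t MakeFrameQueue(void)
{
    dispatch_queue_attr_t attr =
        dispatch_queue_attr_make_with_qos_class(DISPATCH_QUEUE_CONCURRENT,
                                                QOS_CLASS_USER_INTERACTIVE, 0);
    return dispatch_queue_create("com.example.frame-processing", attr);
}

// Hypothetical helper: runs processSlice once for every index in
// [0, sliceCount), potentially in parallel, and returns only after all
// iterations have finished.
static void ProcessSlices(dispatch_queue_t queue, size_t sliceCount,
                          void (^processSlice)(size_t index))
{
    dispatch_apply(sliceCount, queue, ^(size_t index) {
        processSlice(index);
    });
}
```

Because dispatch_apply blocks the caller until every iteration is done, the frame is complete when ProcessSlices returns, and GCD decides how many threads to use for it.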

As the Threading Programming Guide notes:

Note: Although good for occasional communication between threads, you should not use the performSelector:onThread:withObject:waitUntilDone: method for time critical or frequent communication between threads.

If you want low-latency communication between threads, you'll typically want to use a synchronization primitive such as NSConditionLock rather than a runloop.

That said, let's get to the actual questions.

The performSelector:onThread:... interface is generally used for one-shot operations. The more common way to implement a long-running, dedicated thread is to subclass NSThread and override main. Something like this (this is thrown together and untested, based on code from the Threading Programming Guide; I've done all my high-performance work in GCD for years now, so I probably have goofed something here).

#import "WorkerThread.h"

#define NO_DATA 1
#define HAS_DATA 2

@implementation WorkerThread
static NSConditionLock *_condLock;
static NSMutableArray *_queue;

+ (void)initialize
{
    if (self == [WorkerThread class]) {
        _condLock = [[NSConditionLock alloc] initWithCondition:NO_DATA];
        _queue = [NSMutableArray new];
    }
}

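- (void)main
{
    // Until cancelled, wait for data, and then process it.
    while (!self.cancelled)
    {
        @autoreleasepool {
            // Wait with a timeout so that a cancelled thread re-checks the
            // flag instead of blocking forever when no work arrives.
            NSDate *limit = [NSDate dateWithTimeIntervalSinceNow:0.1];
            if (![_condLock lockWhenCondition:HAS_DATA beforeDate:limit]) {
                continue;
            }
            id data = [_queue firstObject];
            [_queue removeObjectAtIndex:0];
            [_condLock unlockWithCondition:(_queue.count == 0 ? NO_DATA : HAS_DATA)];

            // Process data (outside the lock, so other workers can dequeue)
        }
    }
}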

// Submit work to the queue
+ (void)submitWork:(id)data {
    [_condLock lock];
    [_queue addObject:data];
    [_condLock unlockWithCondition:HAS_DATA];
}

@end

You'd spawn up some threads like:

NSArray<WorkerThread *> *workers = @[[WorkerThread new], [WorkerThread new]];
for (WorkerThread *worker in workers) {
    [worker start];
}

[WorkerThread submitWork:data];

And you'd shut down threads like:

for (WorkerThread *worker in workers) {
    [worker cancel];
}
Rob Napier
  • Thank you for the clarification! My main concern with `GCD` is that its internal mechanics can adjust `QoS` automatically, and I don't really know how to influence that except by putting all work at a lower `QoS` from the beginning. What approach can I use to make `GCD` happy with my `QoS` choice and thread workload? – Eugene Alexeev Sep 13 '19 at 14:55
  • You should choose the QoS that matches your intent. If this is real-time processing that influences the UI then you should mark these work items `.userInteractive`. That happens to be the highest QoS, but QoS is more than just priority. You shouldn't be trying to tweak some number to cause things to process in an order (if you need an order, use a queue). You should mark them according to their intent. – Rob Napier Sep 13 '19 at 15:03
  • If you have a question about real-time processing with GCD, I recommend opening a new question about that, detailing the specific problem you're seeing. If this is really real-time work, then it's worth discussing how you deal with dropped frames, as well. (If you don't have a mechanism to drop work to make your deadlines, then they're not really deadlines, and it's not really "real-time.") – Rob Napier Sep 13 '19 at 15:03
  • If I'm not wrong, `AVCaptureVideoDataOutput` does the frame-dropping for me. If I don't keep up with my work, I see `captureOutput(_ output: AVCaptureOutput, didDrop sampleBuffer: CMSampleBuffer, from connection: AVCaptureConnection)` getting called, and hence I don't process "late frames". Or are you referring to something else? – Eugene Alexeev Sep 13 '19 at 15:43
  • Depending on your situation, that may be sufficient. If you've sent a bunch of things to multiple threads, though, you may need to abandon processing that's run past its deadline, or you may never catch up once dropping begins. The capture connection needs to know that you're behind, and it won't automatically know that if you've spun things off to other threads. – Rob Napier Sep 13 '19 at 15:51
  • As a rule, it's often much easier and more reliable to keep your processing on the sampleBufferCallbackQueue, and focus on making it more performant. Often it's better to make things parallel on the GPU or vector processor rather than spawning extra CPU threads. But it depends on your specific situation. – Rob Napier Sep 13 '19 at 15:53
  • Oh, I understand what you meant. I really hadn't considered the case where processing isn't finished on time. Thank you for that tip! Could you recommend some good sources where I can find explanations of techniques for detecting processing that is past its deadline? – Eugene Alexeev Sep 13 '19 at 15:58
  • I'm not aware of really good sources on how to write real-time code. I've been doing it since 1993, and there are just a lot of things you learn. I'm certain there are better paths than mine. – Rob Napier Sep 13 '19 at 18:01
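
To illustrate the point raised in the comments about abandoning work that has run past its deadline, here is one possible, untested sketch. The FrameWork class, its deadline property, and the ProcessUnlessLate function are hypothetical names invented for this example, and stamping each work item with a deadline is only one way to approach the problem.

```objc
#import <Foundation/Foundation.h>

// Hypothetical work item: a payload plus the latest moment (in system
// uptime seconds) at which starting to process it is still useful.
@interface FrameWork : NSObject
@property (nonatomic, strong) id payload;
@property (nonatomic) NSTimeInterval deadline;
@end

@implementation FrameWork
@end

// Returns YES if the item was processed, NO if it was dropped as late.
static BOOL ProcessUnlessLate(FrameWork *work, void (^process)(id payload))
{
    // systemUptime does not jump when the wall clock is adjusted, which
    // makes it a safer basis for deadlines than the current date.
    NSTimeInterval now = [NSProcessInfo processInfo].systemUptime;
    if (now > work.deadline) {
        return NO; // Past the deadline: skip it so later frames can catch up.
    }
    process(work.payload);
    return YES;
}
```

A worker would call ProcessUnlessLate on every item it dequeues, so that once the pipeline falls behind, stale frames are discarded cheaply instead of delaying the fresh ones.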