I'm trying to write an Objective-C equivalent, for iOS, of the .NET recognize() call, which is synchronous. I found code that recognizes speech, but the recognized string is only available inside a block.

I've tried replacing the block with a regular function (the API seems to require a block), using __block variables and returning their values, and using out parameters in the method that declares the block. Finally, I wrote a file from inside the block and read it outside. That got some data out, but it still didn't work the way I want because the call is asynchronous. I also tried writing to a global variable from inside the block and reading it outside.

I'm using the code from "How to implement speech-to-text via Speech framework", which (before I mangled it) is:

/*!
 * @brief Starts listening and recognizing user input through the 
 * phone's microphone
 */

- (void)startListening {

    // Initialize the AVAudioEngine
    audioEngine = [[AVAudioEngine alloc] init];

    // Make sure there's not a recognition task already running
    if (recognitionTask) {
        [recognitionTask cancel];
        recognitionTask = nil;
    }

    // Starts an AVAudio Session
    NSError *error;
    AVAudioSession *audioSession = [AVAudioSession sharedInstance];
    [audioSession setCategory:AVAudioSessionCategoryRecord error:&error];
    [audioSession setActive:YES withOptions:AVAudioSessionSetActiveOptionNotifyOthersOnDeactivation error:&error];

    // Starts a recognition process, in the block it logs the input or stops the audio
    // process if there's an error.
    recognitionRequest = [[SFSpeechAudioBufferRecognitionRequest alloc] init];
    AVAudioInputNode *inputNode = audioEngine.inputNode;
    recognitionRequest.shouldReportPartialResults = YES;
    recognitionTask = [speechRecognizer recognitionTaskWithRequest:recognitionRequest resultHandler:^(SFSpeechRecognitionResult * _Nullable result, NSError * _Nullable error) {
        BOOL isFinal = NO;
        if (result) {
            // Whatever you say in the microphone after pressing the button should be being logged
            // in the console.
            NSLog(@"RESULT:%@",result.bestTranscription.formattedString);
            isFinal = result.isFinal;
        }
        // Stop and clean up on an error or once the final result has arrived
        if (error || isFinal) {
            [audioEngine stop];
            [inputNode removeTapOnBus:0];
            recognitionRequest = nil;
            recognitionTask = nil;
        }
    }];

    // Sets the recording format
    AVAudioFormat *recordingFormat = [inputNode outputFormatForBus:0];
    [inputNode installTapOnBus:0 bufferSize:1024 format:recordingFormat block:^(AVAudioPCMBuffer * _Nonnull buffer, AVAudioTime * _Nonnull when) {
        [recognitionRequest appendAudioPCMBuffer:buffer];
    }];

    // Starts the audio engine, i.e. it starts listening.
    [audioEngine prepare];
    [audioEngine startAndReturnError:&error];
    NSLog(@"Say Something, I'm listening"); 
}

I want to call Listen() (like startListening above), have it block execution until recognition is done, and have it return the string that was said. Actually, I would be thrilled just to get result.bestTranscription.formattedString back to the caller of startListening somehow.

rmaddy
Yrmlsddg
  • Why don't you just use the result inside the block? – ovo Aug 06 '19 at 06:56
  • It's generally a good idea to get to know the idioms of the technology you are working with instead of trying to translate from another tech stack. ObjC and Cocoa are very different from C# and Net. – dandan78 Aug 06 '19 at 08:43

1 Answer

I'd recommend taking another approach. In Objective-C, a function that blocks for a long time is an anti-pattern.

Objective-C has no async/await and no cooperative multitasking, so blocking for long periods can lead to resource leaks and deadlocks. Moreover, if it is done on the main thread (where the app's UI runs), the system may forcibly kill the app for being unresponsive.

Use an asynchronous pattern instead, such as delegates or callbacks.

You could also try a promises library to linearize your code a bit and make it look "sequential".
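For the delegate variant, a minimal sketch could look like the following. Everything named here (CommandListener, CommandListenerDelegate, CommandReceiver, finishWithText:error:) is hypothetical and only illustrates the shape of the pattern. It assumes ARC, and that finishWithText:error: is called from the Speech resultHandler block once a final result or an error arrives.

```objective-c
#import <Foundation/Foundation.h>

@class CommandListener;

// Hypothetical protocol: whoever owns the listener implements it to receive results.
@protocol CommandListenerDelegate <NSObject>
- (void)listener:(CommandListener *)listener didRecognizeText:(NSString *)text;
- (void)listener:(CommandListener *)listener didFailWithError:(NSError *)error;
@end

@interface CommandListener : NSObject
// Weak, so the listener and its owner do not retain each other.
@property (nonatomic, weak) id<CommandListenerDelegate> delegate;
// Assumed to be called from the recognition resultHandler block.
- (void)finishWithText:(NSString *)text error:(NSError *)error;
@end

@implementation CommandListener
- (void)finishWithText:(NSString *)text error:(NSError *)error {
    if (error) {
        [self.delegate listener:self didFailWithError:error];
    } else {
        [self.delegate listener:self didRecognizeText:text];
    }
}
@end

// Example receiver: stores the last recognized command in an ordinary property.
@interface CommandReceiver : NSObject <CommandListenerDelegate>
@property (nonatomic, copy) NSString *lastCommand;
@property (nonatomic, strong) NSError *lastError;
@end

@implementation CommandReceiver
- (void)listener:(CommandListener *)listener didRecognizeText:(NSString *)text {
    self.lastCommand = text;
}
- (void)listener:(CommandListener *)listener didFailWithError:(NSError *)error {
    self.lastError = error;
}
@end
```

If the delegate updates the UI, the call to finishWithText:error: should be made on the main queue.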

The easiest approach with callbacks is to pass a completion block to your "recognize" function and call it with the result string when recognition finishes:

- (void)recognizeWithCompletion:(void (^)(NSString *resultString, NSError *error))completion {
    ...
    recognitionTask = [speechRecognizer recognitionTaskWithRequest:recognitionRequest 
            resultHandler:^(SFSpeechRecognitionResult *result, NSError *error)
    {
        ...
        dispatch_async(dispatch_get_main_queue(), ^{
            completion(result.bestTranscription.formattedString, error);
        });
        ...
    }];
    ...
}

Note that the second parameter (NSError) carries the error, in case the caller wants to react to that too.

Caller side of this:

// client side - add this to your UI code somewhere
__weak typeof(self) weakSelf = self;
[self recognizeWithCompletion:^(NSString *resultString, NSError *error) {
    if (!error) {
        [weakSelf processCommand:resultString];
    }
}];

// separate method
- (void)processCommand:(NSString *command) {
    // can do your processing here based on the text
    ...
}
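Putting the pieces together, one way the elided parts might be filled in is sketched below. This is not the original author's code: the SpeechListener class, its stopListening and tearDown methods, and the "report at most once" guard are assumptions made for illustration. It also assumes that speech and microphone authorization has already been granted (SFSpeechRecognizer requestAuthorization: plus the usage descriptions in Info.plist), that the device locale is supported, and that the method is called on the main thread with no other recognition running.

```objective-c
#import <AVFoundation/AVFoundation.h>
#import <Speech/Speech.h>

// Hypothetical wrapper class; its name and layout are illustrative only.
@interface SpeechListener : NSObject
- (void)recognizeWithCompletion:(void (^)(NSString *resultString, NSError *error))completion;
- (void)stopListening;
@end

@interface SpeechListener ()
@property (nonatomic, strong) SFSpeechRecognizer *speechRecognizer;
@property (nonatomic, strong) AVAudioEngine *audioEngine;
@property (nonatomic, strong) SFSpeechAudioBufferRecognitionRequest *request;
@property (nonatomic, strong) SFSpeechRecognitionTask *task;
@end

@implementation SpeechListener

- (instancetype)init {
    if ((self = [super init])) {
        _speechRecognizer = [[SFSpeechRecognizer alloc] init]; // device locale
        _audioEngine = [[AVAudioEngine alloc] init];
    }
    return self;
}

- (void)recognizeWithCompletion:(void (^)(NSString *resultString, NSError *error))completion {
    __block BOOL delivered = NO;
    // Reports to the caller at most once, always on the main queue.
    void (^finish)(NSString *, NSError *) = ^(NSString *text, NSError *error) {
        dispatch_async(dispatch_get_main_queue(), ^{
            if (delivered) { return; }
            delivered = YES;
            completion(text, error);
        });
    };

    NSError *error = nil;
    AVAudioSession *session = [AVAudioSession sharedInstance];
    if (![session setCategory:AVAudioSessionCategoryRecord error:&error] ||
        ![session setActive:YES
                withOptions:AVAudioSessionSetActiveOptionNotifyOthersOnDeactivation
                      error:&error]) {
        finish(nil, error);
        return;
    }

    SFSpeechAudioBufferRecognitionRequest *request =
        [[SFSpeechAudioBufferRecognitionRequest alloc] init];
    request.shouldReportPartialResults = NO; // only the final transcription is needed
    self.request = request;

    AVAudioInputNode *inputNode = self.audioEngine.inputNode;
    __weak typeof(self) weakSelf = self;
    self.task = [self.speechRecognizer recognitionTaskWithRequest:request
        resultHandler:^(SFSpeechRecognitionResult * _Nullable result, NSError * _Nullable taskError) {
        if (!taskError && !result.isFinal) {
            return; // still listening
        }
        [weakSelf tearDown];
        finish(result.bestTranscription.formattedString, taskError);
    }];

    // Feed microphone buffers into the recognition request.
    [inputNode installTapOnBus:0 bufferSize:1024 format:[inputNode outputFormatForBus:0]
                         block:^(AVAudioPCMBuffer * _Nonnull buffer, AVAudioTime * _Nonnull when) {
        [request appendAudioPCMBuffer:buffer];
    }];

    [self.audioEngine prepare];
    if (![self.audioEngine startAndReturnError:&error]) {
        [self.task cancel];
        [self tearDown];
        finish(nil, error);
    }
}

// Ends the audio stream so the recognizer can produce its final result.
- (void)stopListening {
    [self.audioEngine stop];
    [self.audioEngine.inputNode removeTapOnBus:0];
    [self.request endAudio];
}

- (void)tearDown {
    [self.audioEngine stop];
    [self.audioEngine.inputNode removeTapOnBus:0];
    self.request = nil;
    self.task = nil;
}

@end
```

The caller still never blocks: it calls recognizeWithCompletion:, later calls stopListening (for example from a button), and receives the final string in the completion block on the main queue.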
battlmonstr
  • Thanks, I'll try this tonight. – Yrmlsddg Aug 06 '19 at 12:23
  • Can you explain how that call to completion() has access to result.bestTranscription when result is a variable of the block declared on the line where recognitionTask is assigned? I think if I could legally/validly make the call between your ...'s my problem would be solved. I might not have understood. – Yrmlsddg Aug 07 '19 at 02:24
  • It has to be called from the resultHandler block. I have updated the code to make it more clear. – battlmonstr Aug 07 '19 at 12:59
  • Now I will have the result in the completion block instead of the resultHandler block. Can I write this to a member variable of the class containing recognize()? Or am I still restricted from affecting things because I am in a block? I need to pass the spoken string to my main thread to act on the command that was spoken. In C# I have a flag saying the string is valid which the consumer clears and producer sets; do I need to change approach for objective-c? – Yrmlsddg Aug 08 '19 at 00:00
  • Added some more code to make it more clear. Inside processCommand you can do whatever processing you want ("act on the command"). You can have a flag or an additional variable there in your class if you want to. Also I have added dispatch_async to make sure that it gets processed in the UI thread if you want to do some UI updates. – battlmonstr Aug 08 '19 at 12:16