Math evaluation from speech to text

Question

I'm using the iOS Speech API.

I'm trying to do some math on the output of the speech-to-text framework.

There are several problems. First of all, the user can say something that isn't math at all, so we have to check whether the input is words or a math expression. So I thought we could use something like:

if string.contains("×") || string.contains("+") || string.contains("-")

But it looks awful, and what if the user says, for example, "+2"?

Then I thought we could check whether the input is an Int, with something like:

guard let stringResult = result?.bestTranscription.formattedString else {
    return
}

let convertedNumber = Int(stringResult)

if let stringResult = convertedNumber {
    print(stringResult)
    print("Everything works Fine")
} else {
    print("nope..")
}

and it keeps failing.
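A likely reason, shown as a minimal sketch: Int(_:) only succeeds when the whole string is one integer literal, so a transcript that contains spaces, operators, or words converts to nil. The sample strings below are made up for illustration and are not real recognizer output.

```swift
// Int(_:) parses the entire string or nothing at all.
// These sample strings are illustrative, not actual Speech API transcripts.
let samples = ["7", "+2", "7 + 7", "Pirates"]
let parsed = samples.map { Int($0) }

for (text, value) in zip(samples, parsed) {
    // Show "nil" for strings that are not a single integer literal.
    let shown = value.map { String($0) } ?? "nil"
    print("\(text) -> \(shown)")
}
```

So the Int check can only recognize a bare number; it cannot tell "7 + 7" apart from "Pirates".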

I tried several other odd approaches to this error handling, but I'm out of ideas.

Input, as part of the startRecording function:

if result != nil {
            guard let stringResult = result?.bestTranscription.formattedString else {
                return
            }
            self.inputLabel.text = stringResult

And the calculate function, which just looks totally wrong:

private func calculate(string: String) {
    if string.contains("×") || string.contains("+") || string.contains("-") {
        let stringToCalculate = string.replacingOccurrences(of: "×", with: "*")
        guard let finalScore = NSExpression(format: stringToCalculate).expressionValue(with: nil, context: nil) else { return }
        outputLabel.text = String(describing: finalScore)
    } else {
        outputLabel.text = "Are you sure it's mathematical evaluation ?"
    }
}

Good input Example : 7 + 7 x 2 -> 21

Bad: "Pirates are drinking rum!" or "Pirates" -> outputLabel.text = "Are you sure it's mathematical evaluation ?"

So the question is: how should I handle errors so that I know the input is a mathematical expression and can do math on it, rather than a word or words?

Related https://stackoverflow.com/questions/44253194/parse-speech-output-to-a-json-to-call-application-api/44257741#44257741 — Nikolay Shmyrev, Nov 03 '17 at 13:14

score 1 · Accepted Answer · answered Jul 17 '17 at 15:42

Natural language parsing is a complex task. It can of course be done with simple substring matching or even with regex, but these days there are much more advanced algorithms that use machine learning to classify far more complex inputs. Such systems are exemplar-based: you submit examples to them and they learn to identify intent from those examples. An exemplar-based system could parse something like "multiply the result by three" and understand that "the result" is the previous result and that you need to multiply it. They also give you a parsing confidence.

For an example of such a tool, check Rasa NLU, which is based on spaCy and MITIE; there are also services like LUIS from Microsoft.

It is not easy to run such tools on iOS, so you might want to run them on a server behind a REST API. Compiling MITIE for iOS is theoretically possible, though.
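For the narrow case in the question (digits and operators only), the simple matching route mentioned above might look like the sketch below. The function name normalizedExpression(from:) is hypothetical. The sketch assumes whole numbers only, treats "x", "X" and "×" as multiplication, and rejects input such as "+2" that lacks a left operand; all of these are design choices rather than anything the Speech API prescribes.

```swift
/// Returns a normalized expression such as "7+7*2", or nil when the
/// transcript is not a plain "number operator number ..." sequence.
/// Hypothetical helper; assumes whole numbers and binary operators only.
func normalizedExpression(from transcript: String) -> String? {
    // Map the signs a transcript may contain to ASCII operators.
    let replacements: [Character: Character] = ["×": "*", "x": "*", "X": "*", "÷": "/"]
    let operators: Set<Character> = ["+", "-", "*", "/"]
    let digits: ClosedRange<Character> = "0"..."9"

    var tokens: [String] = []   // alternating numbers and operators
    var number = ""             // digits of the number being read

    // Spaces are skipped, so digits separated only by spaces merge into one number.
    for raw in transcript where raw != " " {
        let ch = replacements[raw] ?? raw
        if digits.contains(ch) {
            number.append(ch)
        } else if operators.contains(ch) {
            // An operator needs a number on its left ("+2" is rejected).
            if number.isEmpty { return nil }
            tokens.append(number)
            number = ""
            tokens.append(String(ch))
        } else {
            // Any letter or other symbol means this is not an expression.
            return nil
        }
    }

    // Reject a trailing operator ("7 +") and input with no operator ("7").
    if number.isEmpty || tokens.isEmpty { return nil }
    tokens.append(number)
    return tokens.joined()
}

print(normalizedExpression(from: "7 + 7 × 2") ?? "not math")
print(normalizedExpression(from: "Pirates are drinking rum!") ?? "not math")
```

Only a non-nil result would then be handed to NSExpression(format:) as in the question's calculate function. Validating first matters because, as far as I know, NSExpression raises an Objective-C exception on a format string it cannot parse instead of returning nil, so the guard in the question never gets a chance to catch bad input.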
