I have generated an LSTM model for audio classification using Keras with TensorFlow as the backend. Upon converting it to a .mlmodel with coremltools, I am running into issues, as you can see here: the dimensions are very different from what is expected.

I used this as the base for my Swift code in Xcode.

In particular, I believe this snippet is what is giving me trouble:

do {
    let request = try SNClassifySoundRequest(mlModel: soundClassifier.model)
    try analyzer.add(request, withObserver: resultsObserver)
} catch {
    print("Unable to prepare request: \(error.localizedDescription)")
    return
}

Running this model gives me the following error:

Unable to prepare request: Invalid model, inputDescriptions.count = 5

Even though, when I build the model, the spec shows what I expect:

description {
  input {
    name: "audioSamples"
    shortDescription: "Audio from microphone"
    type {
      multiArrayType {
        shape: 13
        dataType: DOUBLE
      }
    }
  }

I am trying to incorporate this post into my code, but I am not sure how to adapt it to my needs. Any advice is greatly appreciated. I can see that MLMultiArray is the key to my question, but I am unsure how to put the proper data into it and how to feed it into an SNClassifySoundRequest.

Versions: keras == 2.3.1, coremltools == 3.3

1 Answer


When you use SNClassifySoundRequest, your model needs to have a certain structure. I don't know the exact details off the top of my head, but I think it needs to be a pipeline where the first model is a built-in model that converts the audio to spectrograms.

If you trained your model with Keras, it's most likely not compatible with the requirements of SNClassifySoundRequest.

The good news is that you don't need SNClassifySoundRequest to run your model. Simply call soundClassifier.prediction(...) on the model.

Note that you need to pass in not only the input but also the hidden states of the LSTM layers. Core ML will not automatically manage the LSTM state for you (unlike Keras).
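
As a rough sketch, assuming the auto-generated soundClassifier class, the input names your converted model appears to use (audioSamples plus the lstm_*_h_in/lstm_*_c_in state inputs), and a placeholder hidden size of 128 in place of your LSTMs' actual number of units, the call might look like this:

import CoreML

// Placeholder hidden size; replace with the number of units in your LSTMs.
let hiddenSize = 128

// A fresh MLMultiArray contains uninitialized memory, so zero it explicitly.
func zeroedMultiArray(count: Int) throws -> MLMultiArray {
    let array = try MLMultiArray(shape: [NSNumber(value: count)], dataType: .double)
    for i in 0..<array.count { array[i] = 0 }
    return array
}

do {
    // Shape 13 matches the audioSamples input in the spec; in practice,
    // fill this array with your actual features instead of zeros.
    let audioData = try zeroedMultiArray(count: 13)

    // Initial LSTM states, zero-filled.
    let h1 = try zeroedMultiArray(count: hiddenSize)
    let c1 = try zeroedMultiArray(count: hiddenSize)
    let h2 = try zeroedMultiArray(count: hiddenSize)
    let c2 = try zeroedMultiArray(count: hiddenSize)

    // soundClassifier is an instance of the auto-generated model class.
    let output = try soundClassifier.prediction(
        audioSamples: audioData,
        lstm_1_h_in: h1, lstm_1_c_in: c1,
        lstm_2_h_in: h2, lstm_2_c_in: c2)
    print(output)
} catch {
    print("Prediction failed: \(error)")
}

Check the generated Swift interface for the exact method signature; the names above are taken from your model's spec and may differ.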

Matthijs Hollemans
  • Thank you, you pointed me in the right direction for sure. I am using [this](https://apple.github.io/turicreate/docs/userguide/sound_classifier/export-coreml.html) to help as well, and it seems to be good to go, but I am having an issue setting up the model input. I know I feed the input 'audioSamples' the actual audio data, but what do I pass for the other LSTM inputs? PS. I also got your book, great info! – Joshua Ball Mar 16 '20 at 05:47
  • `guard let modelOutput = try? self.model.prediction(audioSamples: audioData, lstm_1_h_in: <#T##MLMultiArray?#>, lstm_1_c_in: <#T##MLMultiArray?#>, lstm_2_h_in: <#T##MLMultiArray?#>, lstm_2_c_in: <#T##MLMultiArray?#>) else { fatalError("Error calling predict") }` – Joshua Ball Mar 16 '20 at 05:50
  • Furthermore, the exact error I am getting is "[coreml] Failure verifying inputs." This also occurs at the modelInput stage, like: `let modelInput = classyInput(audioSamples: audioData, lstm_1_h_in: <#T##MLMultiArray?#>, lstm_1_c_in: <#T##MLMultiArray?#>, lstm_2_h_in: <#T##MLMultiArray?#>, lstm_2_c_in: <#T##MLMultiArray?#>)` and I have the output as: `guard let modelOutput = try? self.model.prediction(input: modelInput) else { fatalError("Failure predicting model") }` – Joshua Ball Mar 16 '20 at 06:14
  • You'll need to create MLMultiArray objects of the correct shape (this depends on the number of hidden units in your LSTMs) and fill them with the initial state. Typically this will be zeros or random numbers. But note that a new MLMultiArray will have garbage in it, i.e. it is uninitialized memory, so you'll need to overwrite it yourself with zeros or random numbers first. You can also pass in nil. – Matthijs Hollemans Mar 16 '20 at 08:41
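
For what it's worth, a minimal sketch of that zero-filling, assuming a DOUBLE multiarray as in the spec above and a placeholder shape of [128] standing in for the real hidden size:

import CoreML
import Foundation

// Allocate an LSTM state vector and zero out its uninitialized memory.
func makeZeroedState() throws -> MLMultiArray {
    let state = try MLMultiArray(shape: [128], dataType: .double)
    memset(state.dataPointer, 0, state.count * MemoryLayout<Double>.stride)
    return state
}

Passing nil for the state inputs instead, as mentioned in the last comment, sidesteps the initialization entirely.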