
I'm trying to use Core ML in Xcode to classify images that are simply single digits or letters. To start out with, I'm just using .png images of digits. Using the Create ML tool, I built an image classifier (NOT including any Vision support) and provided a set of about 300 training images and a separate set of 50 testing images. The model trains and tests successfully. Still within the tool, I feed the finished model another set of 100 images to classify, and it works properly, identifying 98 of them correctly.

Then I created a Swift sample program to access the model (starting from the macOS single view app template); it accepts a dropped image file, calls the model's prediction method, and prints the result. The problem is that the model expects an object of type CVPixelBuffer and I'm not sure how to properly create one from an NSImage. I found some reference code and incorporated it, but when I actually drag my classification images onto the app it's only about 50% accurate. So I'm wondering if anyone has experience with this type of model. It would be nice if there were a way to look at the Create ML source code to see how it processes a dropped image when predicting from the model.

The code for processing the image and invoking the model's prediction method is:

import Cocoa
import CoreML

// initialize the model
// (MLSample is the model class generated by the Create ML tool and imported into the project)
mlModel2 = MLSample()

// prediction logic for the image
// (included in a func; fname is the path obtained from the dropped file)
guard let fimage = NSImage(contentsOfFile: fname) else { return }
do {
    // Get a CGImage from the NSImage
    guard let fcgImage = fimage.cgImage(forProposedRect: nil, context: nil, hints: nil) else { return }
    // Look up the size/pixel-format constraint for the model's "image" input
    let imageConstraint = mlModel2?.model.modelDescription.inputDescriptionsByName["image"]?.imageConstraint
    // Let Core ML scale and convert the CGImage into a pixel buffer matching that constraint
    let featureValue = try MLFeatureValue(cgImage: fcgImage, constraint: imageConstraint!, options: nil)
    let pxbuf = featureValue.imageBufferValue
    // Run the generated prediction method
    let mro = try mlModel2?.prediction(image: pxbuf!)
    if let mro = mro {
        let mroLbl = mro.classLabel
        let mroProb = mro.classLabelProbs[mroLbl] ?? 0.0
        print(String(format: "M2 MLFeature: %@ %5.2f", mroLbl, mroProb))
        return
    }
}
catch {
    print(error.localizedDescription)
}
return

1 Answer


There are several ways to do this.

The easiest is what you're already doing: create an MLFeatureValue from the CGImage object.
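A minimal sketch of that route, going through the generic Core ML API with a feature provider instead of pulling the pixel buffer back out (this assumes your input is named "image" and the output is named "classLabel", as in your generated class):

import Cocoa
import CoreML

func classify(cgImage: CGImage, model: MLModel) throws -> String? {
    // Ask the model what size and pixel format it expects for the "image" input
    guard let constraint = model.modelDescription
        .inputDescriptionsByName["image"]?.imageConstraint else { return nil }

    // Core ML resizes and converts the CGImage to match that constraint
    let imageValue = try MLFeatureValue(cgImage: cgImage, constraint: constraint, options: nil)

    // Hand the feature value straight to the model
    let input = try MLDictionaryFeatureProvider(dictionary: ["image": imageValue])
    let output = try model.prediction(from: input)
    return output.featureValue(for: "classLabel")?.stringValue
}

// e.g. with the question's variables: try classify(cgImage: fcgImage, model: mlModel2!.model)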

My repo CoreMLHelpers has a different way to convert CGImage to CVPixelBuffer.
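Roughly, that manual route looks like the sketch below. This is a simplified illustration of the idea, not the actual CoreMLHelpers code, and you still have to make the size and pixel format match what the model expects:

import CoreGraphics
import CoreVideo

// Draw a CGImage into a newly created 32BGRA CVPixelBuffer of the given size
func pixelBuffer(from image: CGImage, width: Int, height: Int) -> CVPixelBuffer? {
    var pixelBuffer: CVPixelBuffer?
    let attrs: [CFString: Any] = [
        kCVPixelBufferCGImageCompatibilityKey: true,
        kCVPixelBufferCGBitmapContextCompatibilityKey: true
    ]
    let status = CVPixelBufferCreate(kCFAllocatorDefault, width, height,
                                     kCVPixelFormatType_32BGRA,
                                     attrs as CFDictionary, &pixelBuffer)
    guard status == kCVReturnSuccess, let buffer = pixelBuffer else { return nil }

    CVPixelBufferLockBaseAddress(buffer, [])
    defer { CVPixelBufferUnlockBaseAddress(buffer, []) }

    guard let context = CGContext(data: CVPixelBufferGetBaseAddress(buffer),
                                  width: width, height: height,
                                  bitsPerComponent: 8,
                                  bytesPerRow: CVPixelBufferGetBytesPerRow(buffer),
                                  space: CGColorSpaceCreateDeviceRGB(),
                                  bitmapInfo: CGImageAlphaInfo.noneSkipFirst.rawValue
                                            | CGBitmapInfo.byteOrder32Little.rawValue)
    else { return nil }

    // Scale the image to fill the buffer (distorts aspect ratio if the sizes differ)
    context.draw(image, in: CGRect(x: 0, y: 0, width: width, height: height))
    return buffer
}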

A third way is to get Xcode 12 (currently in beta). The automatically-generated class now accepts images instead of just CVPixelBuffer.
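Something along these lines (a hypothetical sketch; the exact initializer names are derived from your input feature name, so check the generated MLSample.swift):

import Cocoa
import CoreML

// Assumes the model class is MLSample and its image input is named "image"
func classifyWithGeneratedClass(cgImage: CGImage) throws -> String {
    let model = try MLSample(configuration: MLModelConfiguration())
    // The generated input class takes a CGImage (or a file URL via imageAt:)
    // and handles the resizing and conversion to CVPixelBuffer for you
    let input = try MLSampleInput(imageWith: cgImage)
    let output = try model.prediction(input: input)
    return output.classLabel
}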

In cases like this it's useful to look at the image that Core ML actually sees. You can use the CheckInputImage project from https://github.com/hollance/coreml-survival-guide to verify this (it's an iOS project but easy enough to port to the Mac).
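If porting that is too much trouble, a quick alternative is to write the CVPixelBuffer you're about to feed the model out to a PNG and inspect it by eye. A small sketch:

import Cocoa
import CoreImage

// Render the pixel buffer back into an image and save it as a PNG for inspection
func dump(pixelBuffer: CVPixelBuffer, to url: URL) {
    let ciImage = CIImage(cvPixelBuffer: pixelBuffer)
    let rep = NSCIImageRep(ciImage: ciImage)
    let image = NSImage(size: rep.size)
    image.addRepresentation(rep)
    if let tiff = image.tiffRepresentation,
       let bitmap = NSBitmapImageRep(data: tiff),
       let png = bitmap.representation(using: .png, properties: [:]) {
        try? png.write(to: url)
    }
}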

If the input image is correct, and you still get the wrong predictions, then probably the image preprocessing options on the model are wrong. For more info: https://machinethink.net/blog/help-core-ml-gives-wrong-output/

Matthijs Hollemans