
I have been trying to get started with Core ML (Apple's machine learning framework) and am following these two tutorials:

1) https://www.appcoda.com/coreml-introduction/

2) https://www.raywenderlich.com/164213/coreml-and-vision-machine-learning-in-ios-11-tutorial

The first tutorial uses an Inception V3 model and the second uses the Places205-GoogLeNet model for its explanation.

After all the basic setup steps, the Places205-GoogLeNet tutorial uses the following code:

func detectScene(image: CIImage) {
    answerLabel.text = "detecting scene..."

    // Load the ML model through its generated class
    guard let model = try? VNCoreMLModel(for: GoogLeNetPlaces().model) else {
        fatalError("can't load Places ML model")
    }
}

and the Inception V3 tutorial uses this:

guard let prediction = try? model.prediction(image: pixelBuffer!) else {
    return
}

What is the difference between these two approaches, and which one is recommended, given that both can take a pixel buffer and produce a result?

– Teja C

2 Answers


In the first tutorial you need to take care of resizing images to 299×299 yourself. In the second tutorial, they use the Vision framework, which does it automatically. I think the Vision approach is cleaner.
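For illustration, a minimal sketch of that automatic scaling (assuming the generated GoogLeNetPlaces class from the second tutorial; the imageCropAndScaleOption line is the relevant part):

import CoreML
import Vision

// Vision scales/crops the input to the model's expected size
// (224×224 for GoogLeNetPlaces) before running inference.
func makeSceneRequest() -> VNCoreMLRequest? {
    guard let model = try? VNCoreMLModel(for: GoogLeNetPlaces().model) else {
        return nil
    }
    let request = VNCoreMLRequest(model: model) { request, _ in
        guard let results = request.results as? [VNClassificationObservation],
              let top = results.first else { return }
        print("\(top.identifier): \(top.confidence)")
    }
    // Controls how Vision fits the image; .centerCrop is the default
    request.imageCropAndScaleOption = .centerCrop
    return request
}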

– Vitaly Migunov
  • There's a post [here](http://deepdojo.com/mlmodel-api) on how the API between Core ML and the Vision framework relate to each other. – otto Oct 27 '17 at 16:12

The Vision framework is a set of tools that help you set up the whole image-processing pipeline. Among these tools is Core ML, driven by a model you provide, but Vision is not limited to machine learning: it helps you preprocess, rescale, and crop images, and detect rectangles, barcodes, faces, and much more. Check the docs for more information. Apart from work performed directly on the images, it also helps you perform requests to your model, which is very important if you have complicated sequences of requests or want to chain those operations with some other processing.
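To make the "not limited to machine learning" point concrete, here is a minimal sketch using Vision's built-in face detection, with no Core ML model involved at all:

import Vision

// A plain Vision request: VNDetectFaceRectanglesRequest needs no model.
func detectFaces(in cgImage: CGImage) {
    let request = VNDetectFaceRectanglesRequest { request, _ in
        let faces = request.results as? [VNFaceObservation] ?? []
        print("found \(faces.count) face(s)")
    }
    let handler = VNImageRequestHandler(cgImage: cgImage, options: [:])
    try? handler.perform([request])
}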

With pure Core ML, you would have to implement all of these functionalities yourself, because Core ML's responsibility is only to set up your model and give you a simple API over it.
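As a rough sketch of what "yourself" means in practice (adapted from the usual tutorial boilerplate, so treat the details as an assumption): with pure Core ML you need a helper like this to turn a UIImage into the fixed-size CVPixelBuffer that the generated prediction(image:) method expects, e.g. 299×299 for Inception V3.

import CoreML
import UIKit

// Manual preprocessing for pure Core ML: resize to the model's input
// size and convert to a CVPixelBuffer. Orientation handling is omitted.
func pixelBuffer(from image: UIImage, size: CGSize) -> CVPixelBuffer? {
    let attrs = [kCVPixelBufferCGImageCompatibilityKey: kCFBooleanTrue,
                 kCVPixelBufferCGBitmapContextCompatibilityKey: kCFBooleanTrue] as CFDictionary
    var buffer: CVPixelBuffer?
    CVPixelBufferCreate(kCFAllocatorDefault, Int(size.width), Int(size.height),
                        kCVPixelFormatType_32ARGB, attrs, &buffer)
    guard let pixelBuffer = buffer, let cgImage = image.cgImage else { return nil }

    CVPixelBufferLockBaseAddress(pixelBuffer, [])
    defer { CVPixelBufferUnlockBaseAddress(pixelBuffer, []) }

    guard let context = CGContext(data: CVPixelBufferGetBaseAddress(pixelBuffer),
                                  width: Int(size.width), height: Int(size.height),
                                  bitsPerComponent: 8,
                                  bytesPerRow: CVPixelBufferGetBytesPerRow(pixelBuffer),
                                  space: CGColorSpaceCreateDeviceRGB(),
                                  bitmapInfo: CGImageAlphaInfo.noneSkipFirst.rawValue)
    else { return nil }
    context.draw(cgImage, in: CGRect(origin: .zero, size: size))
    return pixelBuffer
}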

Vision is not a pure wrapper around Core ML, because it does much more than performing requests and initializing models, but it does use Core ML for some of its functionality (specifically, in VNCoreMLRequest).
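For completeness, a sketch of how VNCoreMLRequest ties the question's detectScene(image:) together once the model is loaded (dispatching to a background queue is my assumption here, since perform(_:) runs synchronously):

import Vision

// Run a Core ML model through Vision on a CIImage.
func detectScene(image: CIImage, using model: VNCoreMLModel) {
    let request = VNCoreMLRequest(model: model) { request, _ in
        if let top = (request.results as? [VNClassificationObservation])?.first {
            print("\(Int(top.confidence * 100))%: \(top.identifier)")
        }
    }
    let handler = VNImageRequestHandler(ciImage: image, options: [:])
    DispatchQueue.global(qos: .userInitiated).async {
        try? handler.perform([request])
    }
}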

Of the links you provided, the first (AppCoda) is about pure Core ML and the second (Ray Wenderlich) is about Vision.

– Wladek Surala