I am trying to combine CoreML and ARKit in my project, using the Inceptionv3 model provided on Apple's website.
I am starting from the standard ARKit template (Xcode 9 beta 3).
Instead of instantiating a new camera session, I reuse the session that has already been started by the ARSCNView.
At the end of my viewDidLoad, I write:
sceneView.session.delegate = self
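For context, the only additions to the template's view controller are the model property and that delegate assignment (a sketch; I am assuming the model is simply held as a property, which is how self.model is used below):

    let model = Inceptionv3()

    override func viewDidLoad() {
        super.viewDidLoad()
        sceneView.delegate = self
        sceneView.scene = SCNScene(named: "art.scnassets/ship.scn")!
        // Reuse the session started by the ARSCNView instead of creating an AVCaptureSession.
        sceneView.session.delegate = self
    }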
I then extend my ViewController to conform to the ARSessionDelegate protocol (all of its requirements are optional):
// MARK: ARSessionDelegate
extension ViewController: ARSessionDelegate {

    // Called for every new camera frame delivered by the session.
    func session(_ session: ARSession, didUpdate frame: ARFrame) {
        do {
            let prediction = try self.model.prediction(image: frame.capturedImage)
            DispatchQueue.main.async {
                if let prob = prediction.classLabelProbs[prediction.classLabel] {
                    self.textLabel.text = "\(prediction.classLabel) \(String(describing: prob))"
                }
            }
        } catch let error as NSError {
            print("Unexpected error occurred: \(error.localizedDescription).")
        }
    }
}
At first I tried that code, but then noticed that Inception requires a pixel buffer of type Image<RGB, 299, 299>, i.e. a 299×299 color image.
Although it is not recommended, I thought I would just resize my frame and then try to get a prediction out of it. I am resizing using this function (taken from https://github.com/yulingtianxia/Core-ML-Sample):
func resize(pixelBuffer: CVPixelBuffer) -> CVPixelBuffer? {
    let imageSide = 299
    var ciImage = CIImage(cvPixelBuffer: pixelBuffer, options: nil)
    let transform = CGAffineTransform(scaleX: CGFloat(imageSide) / CGFloat(CVPixelBufferGetWidth(pixelBuffer)),
                                      y: CGFloat(imageSide) / CGFloat(CVPixelBufferGetHeight(pixelBuffer)))
    ciImage = ciImage.transformed(by: transform).cropped(to: CGRect(x: 0, y: 0, width: imageSide, height: imageSide))
    let ciContext = CIContext()
    var resizeBuffer: CVPixelBuffer?
    // The destination buffer is created with the same pixel format as the source.
    CVPixelBufferCreate(kCFAllocatorDefault, imageSide, imageSide,
                        CVPixelBufferGetPixelFormatType(pixelBuffer), nil, &resizeBuffer)
    ciContext.render(ciImage, to: resizeBuffer!)
    return resizeBuffer
}
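In the delegate method above, the prediction call then becomes (a sketch; only these two lines change):

    guard let resizedBuffer = resize(pixelBuffer: frame.capturedImage) else { return }
    let prediction = try self.model.prediction(image: resizedBuffer)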
Unfortunately, this is not enough to make it work. This is the error that is caught:
Unexpected error occurred: Input image feature image does not match model description.
2017-07-20 AR+MLPhotoDuplicatePrediction[928:298214] [core]
Error Domain=com.apple.CoreML Code=1
"Input image feature image does not match model description"
UserInfo={NSLocalizedDescription=Input image feature image does not match model description,
NSUnderlyingError=0x1c4a49fc0 {Error Domain=com.apple.CoreML Code=1
"Image is not expected type 32-BGRA or 32-ARGB, instead is Unsupported (875704422)"
UserInfo={NSLocalizedDescription=Image is not expected type 32-BGRA or 32-ARGB, instead is Unsupported (875704422)}}}
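For what it's worth, decoding the unsupported format code from the error into its FourCC (a quick check in a playground, not part of the app) shows that the resized buffer is still in ARKit's native bi-planar YCbCr format:

    let code: UInt32 = 875704422
    let fourCC = String((0..<4).compactMap {
        UnicodeScalar((code >> (8 * (3 - $0))) & 0xFF).map(Character.init)
    })
    print(fourCC) // "420f" -> kCVPixelFormatType_420YpCbCr8BiPlanarFullRange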
Not sure what I can do from here.
If anyone has a better suggestion for combining the two, I'm all ears.
Edit: I also tried the resizePixelBuffer method from YOLO-CoreML-MPSNNGraph suggested by @dfd; the error is exactly the same.
Edit2: I changed the pixel format of the destination buffer to kCVPixelFormatType_32BGRA (instead of reusing the format of the pixelBuffer passed into resizePixelBuffer):
let pixelFormat = kCVPixelFormatType_32BGRA // line 48
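The equivalent change in the resize(pixelBuffer:) function above would be to create the destination buffer as BGRA instead of reusing the source format (CIContext then converts during render):

    CVPixelBufferCreate(kCFAllocatorDefault, imageSide, imageSide,
                        kCVPixelFormatType_32BGRA, nil, &resizeBuffer)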
I no longer get the error, but as soon as I try to make a prediction, the AVCaptureSession stops. It seems I am running into the same issue Enric_SA describes on the Apple developer forums.
Edit3: I tried implementing rickster's solution, and it works well with Inceptionv3. I then wanted to try a feature observation (VNClassificationObservation). At this time it is not working with TinyYOLO: the bounding boxes are wrong. Trying to figure it out.
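For reference, the classification path now looks roughly like this (a sketch of the standard Vision + CoreML pattern; classifyCurrentFrame is just an illustrative name, and in practice the VNCoreMLModel should be created once rather than per frame):

    import Vision

    extension ViewController {
        func classifyCurrentFrame(_ frame: ARFrame) {
            // Wrap the generated CoreML class so Vision can drive it.
            guard let visionModel = try? VNCoreMLModel(for: Inceptionv3().model) else { return }
            let request = VNCoreMLRequest(model: visionModel) { request, _ in
                guard let best = (request.results as? [VNClassificationObservation])?.first else { return }
                DispatchQueue.main.async {
                    self.textLabel.text = "\(best.identifier) \(best.confidence)"
                }
            }
            // Vision scales/crops the capturedImage to the model's 299x299 input for us.
            request.imageCropAndScaleOption = .centerCrop
            let handler = VNImageRequestHandler(cvPixelBuffer: frame.capturedImage, options: [:])
            try? handler.perform([request])
        }
    }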