There are two options for passing images into CoreML models: wrap the inference in the Vision framework, or hand a CVPixelBuffer directly to the model.
Is there any data on the memory and processing overhead of using the Vision framework versus passing a CVPixelBuffer directly to CoreML?
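For reference, a rough sketch of the two paths. `MyModel`/`MyModelOutput` and the `image` input name are assumptions standing in for a generated Core ML model class with an image input; the rest uses the standard Vision and Core ML APIs.

```swift
import CoreML
import CoreVideo
import Vision

// Option 1: Vision. The request handler accepts a CGImage (or other image source)
// and VNCoreMLRequest scales/crops it to whatever input size the model declares.
func runWithVision(cgImage: CGImage, visionModel: VNCoreMLModel) throws {
    let request = VNCoreMLRequest(model: visionModel) { request, _ in
        // Results arrive as VNObservation subclasses, e.g. VNClassificationObservation.
        print(request.results ?? [])
    }
    request.imageCropAndScaleOption = .centerCrop
    try VNImageRequestHandler(cgImage: cgImage, options: [:]).perform([request])
}

// Option 2: Core ML directly. The CVPixelBuffer must already match the model's
// expected size and pixel format; any resizing/rotation is the caller's problem.
func runDirectly(pixelBuffer: CVPixelBuffer, model: MyModel) throws -> MyModelOutput {
    return try model.prediction(image: pixelBuffer)
}
```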
Thoughts based on what I've seen while debugging:
Memory
Assuming we already have the data in a CVPixelBuffer, creating the CGImage to pass to Vision seems to double the memory usage. It also looks like Vision allocates a new CoreVideo/CoreImage object of its own in createPixelBufferFromVNImageBuffer, which makes sense, as it needs its own copy of the image to crop/rotate/scale.
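To make the first of those copies concrete, here is a hedged sketch contrasting the CGImage route (which renders a second full-size copy of the pixels via Core Image) with handing the CVPixelBuffer to VNImageRequestHandler directly; Vision may still make its own internal scaled/rotated copy either way, as noted above.

```swift
import CoreImage
import CoreVideo
import Vision

// Two ways of feeding the same CVPixelBuffer to Vision.
func handler(for pixelBuffer: CVPixelBuffer, viaCGImage: Bool) -> VNImageRequestHandler? {
    if viaCGImage {
        // Extra allocation: Core Image renders the buffer into a brand-new CGImage.
        let ciImage = CIImage(cvPixelBuffer: pixelBuffer)
        guard let cgImage = CIContext().createCGImage(ciImage, from: ciImage.extent) else {
            return nil
        }
        return VNImageRequestHandler(cgImage: cgImage, options: [:])
    } else {
        // No intermediate copy on the caller's side: Vision reads the buffer as-is.
        return VNImageRequestHandler(cvPixelBuffer: pixelBuffer, options: [:])
    }
}
```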
Processing
You're going to have to do the rotation and/or scaling either way, and I'd assume Vision does them at least as efficiently as you could by hand with Accelerate, so there shouldn't be any meaningful overhead here.
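If you did want to do the scaling yourself with Accelerate, a minimal sketch might look like the following. It assumes a 4-channel, 8-bit interleaved pixel buffer (e.g. 32BGRA); the function name and error handling are illustrative, not from the question.

```swift
import Accelerate
import CoreVideo

// Scale a 32BGRA CVPixelBuffer to a new size using vImage.
func scaled(_ source: CVPixelBuffer, toWidth width: Int, height: Int) -> CVPixelBuffer? {
    var destination: CVPixelBuffer?
    guard CVPixelBufferCreate(kCFAllocatorDefault, width, height,
                              CVPixelBufferGetPixelFormatType(source),
                              nil, &destination) == kCVReturnSuccess,
          let dst = destination else { return nil }

    CVPixelBufferLockBaseAddress(source, .readOnly)
    CVPixelBufferLockBaseAddress(dst, [])
    defer {
        CVPixelBufferUnlockBaseAddress(source, .readOnly)
        CVPixelBufferUnlockBaseAddress(dst, [])
    }

    // Wrap both buffers in vImage_Buffer descriptors pointing at the locked pixels.
    var srcBuffer = vImage_Buffer(data: CVPixelBufferGetBaseAddress(source),
                                  height: vImagePixelCount(CVPixelBufferGetHeight(source)),
                                  width: vImagePixelCount(CVPixelBufferGetWidth(source)),
                                  rowBytes: CVPixelBufferGetBytesPerRow(source))
    var dstBuffer = vImage_Buffer(data: CVPixelBufferGetBaseAddress(dst),
                                  height: vImagePixelCount(height),
                                  width: vImagePixelCount(width),
                                  rowBytes: CVPixelBufferGetBytesPerRow(dst))

    // Resample the source into the destination geometry.
    guard vImageScale_ARGB8888(&srcBuffer, &dstBuffer, nil,
                               vImage_Flags(kvImageNoFlags)) == kvImageNoError else {
        return nil
    }
    return dst
}
```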