I need to find the location of an image that the user provides within an image that I provide.
It is safe to assume at the time of the analysis that the user provided image is certain to be contained within the image to be compared with.
I’ve looked through and even have some experience with Core ML and Vision image classification however I am struggling to convince myself that it is the correct way to approach this problem. I feel like the way “feature values” is handled in Vision it is almost the reverse of what I’m looking for.
My question: Is there a feature of Core ML or Vision that tackles this particular problem head on?
Other information that may be needed;
It is not safe to assume that images provided are pixel to pixel perfect due to possible resolution differences.
They may also be provided in any shape although possible to crop to a standardised shape before analysis.
Rotation will also need to be accounted for.
There would not be cases where the image is in the image twice.