I'm working on an Android app (though eventually I'll want to do the same thing on iOS) and I'm looking to build an image recognition feature into it. The user would snap a picture, then this component of the app would need to figure out what's in the image, whether it's a bowling ball, a salad, a book, you name it. It would also be helpful if it could estimate roughly how big the object is; I imagine the camera's focus values could help with that. The objects would not be moving.
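To make the size idea concrete, here is a rough sketch of what I have in mind, assuming a simple pinhole camera model. The class and method names are hypothetical (this is not an Android API), and the inputs are assumptions: I believe Camera2 exposes the focus distance (`CaptureResult.LENS_FOCUS_DISTANCE`), the focal length, and the physical sensor size, but I haven't verified how accurate those values are on real devices, and the object's width in pixels would still have to come from the recognition step.

```java
// Hypothetical helper, not part of any Android API.
// Pinhole approximation: realWidth = widthOnSensor * distance / focalLength.
public final class SizeEstimate {
    private SizeEstimate() {}

    /**
     * @param objectWidthPx width of the object in the photo, in pixels
     * @param imageWidthPx  full width of the photo, in pixels
     * @param sensorWidthMm physical width of the camera sensor, in millimetres
     * @param focalLengthMm lens focal length, in millimetres
     * @param distanceM     distance to the object, in metres (assumed to come
     *                      from the reported focus distance)
     * @return estimated real-world width of the object, in metres
     */
    public static double objectWidthMeters(double objectWidthPx, double imageWidthPx,
                                           double sensorWidthMm, double focalLengthMm,
                                           double distanceM) {
        if (imageWidthPx <= 0 || focalLengthMm <= 0) {
            throw new IllegalArgumentException("image width and focal length must be positive");
        }
        // How wide the object's projection is on the physical sensor.
        double widthOnSensorMm = objectWidthPx / imageWidthPx * sensorWidthMm;
        // Similar triangles; the millimetre units cancel, leaving metres.
        return widthOnSensorMm * distanceM / focalLengthMm;
    }

    public static void main(String[] args) {
        // Made-up numbers purely for illustration.
        double width = objectWidthMeters(1000, 4000, 6.0, 4.0, 0.8);
        System.out.printf("Estimated width: %.3f m%n", width);
    }
}
```

This only works if the object is roughly facing the camera and the focus distance is trustworthy, so I'd treat the result as a ballpark figure at best.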
I've heard of neural networks being used for this, but I'm not sure how I would implement one, especially since I want to recognize a very wide range of objects. I also doubt this sort of processing could run natively on a phone. What are some solutions to this problem?