API to retrieve images from within an image or pdf

Question

I am looking for a way to extract images from within another image. For example:

Here is a picture taken of a paper. It includes text, an image of a camera, and an image of a QR code. Is there an API that can extract those two (the camera and the QR code) from the larger image and separate them into their own individual images? I know this is doable for the text (OCR), but I need some way to do image recognition, if that even exists. So far, I can't find any reference to doing this besides extracting images from PDFs, and none of those tools can extract them from an imperfect PDF.

Price for the API (Node.js preferred, but I can adapt to any language) is not a big concern. I'm just not sure this is even possible to do without programming a legitimate artificial intelligence using machine learning, and I would no doubt cause a global internet shutdown by breaking everything if I attempted that.

Anyway, any suggestions would be great and much appreciated. Thanks!

EDIT: The images aren't always those; they can be images of anything, from potatoes to flags.

Adobe Acrobat does this perfectly - you just click Edit PDF and it OCRs pictures and even fonts. — supsayan, Nov 11 '22 at 20:32
Supsayan, thanks for the suggestion. Would Adobe work on images converted to PDFs, though? So essentially blurry PDFs that are slanted and imperfect. Of all the other PDF image extractors I tested, none could achieve the task. I haven't tested Adobe yet, so I'll try that. — blueberrr, Nov 11 '22 at 20:40
Ah, alright. I won't do this again. One last thing: do you have any recommendations for a site where I can ask for recommendations? — blueberrr, Nov 12 '22 at 00:03

score 1 · Answer 1 · answered Nov 11 '22 at 20:34

For the QR code, you can simply decode it with a QR code scanner library and generate a fresh QR code from the decoded output. As for the camera, you are going to need an image recognition service like Google Cloud Vision, or you can train your own neural network with something like TensorFlow to recognize pictures of cameras.
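
As a rough illustration of the cropping step that would follow detection: Cloud Vision's object localization feature reports each detected object with a bounding polygon whose vertices are normalized to the range 0 to 1. The sketch below only covers turning such vertices into a pixel crop box. The function name `crop_box` and the plain `(x, y)` tuple input are assumptions made for this example, not part of any library, and the detection call itself is not shown.

```python
import math
from typing import List, Tuple

# Assumed input shape: one (x, y) pair per polygon vertex, each in [0, 1],
# mirroring the normalized vertices an object-localization service reports.
Vertex = Tuple[float, float]


def crop_box(vertices: List[Vertex], width: int, height: int) -> Tuple[int, int, int, int]:
    """Convert normalized polygon vertices to a (left, upper, right, lower) pixel box."""
    # Clamp to [0, 1] so a slightly out-of-range vertex cannot leave the image.
    xs = [min(max(x, 0.0), 1.0) for x, _ in vertices]
    ys = [min(max(y, 0.0), 1.0) for _, y in vertices]
    left = math.floor(min(xs) * width)
    upper = math.floor(min(ys) * height)
    right = math.ceil(max(xs) * width)
    lower = math.ceil(max(ys) * height)
    return (left, upper, right, lower)


# Hypothetical detection of one object in an 800x600 photo.
camera_vertices = [(0.25, 0.5), (0.75, 0.5), (0.75, 1.0), (0.25, 1.0)]
print(crop_box(camera_vertices, 800, 600))  # (200, 300, 600, 600)
```

The resulting tuple is in the order that Pillow's `Image.crop` expects, so each detected object could be saved as its own image with a call along the lines of `image.crop(box).save(...)`.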

Arda

Thank you for your suggestion. I didn't know Google Cloud Vision could also recognize images. I'll look into that. – blueberrr Nov 11 '22 at 20:39
No problem. By the way, if the only things your image will contain are some text, a camera, and a QR code, you might eliminate the text and the QR code, leaving you with only the camera. This way, you won't have to deal with extracting the camera alone. – Arda Nov 11 '22 at 20:45
Hmmm, didn't think of that. Is this also a feature in Cloud Vision? – blueberrr Nov 11 '22 at 21:33
I apologize, but I can't find where in the API I can detect and isolate images. The only functions available (maybe it's in one of these categories and I just can't tell that it can do it) are: label detection, text detection, safe search, face detection, celebrity detection, landmark detection, logo detection, image properties, crop hints, web detection, and object localization. – blueberrr Nov 11 '22 at 21:47

K J · Answer 2 · 2022-11-13T00:32:10.553

0

QR detectors abound around the web, and some are on GitHub. For single objects, you could try the Hotpot API (https://hotpot.ai/docs/api); its code example links to https://hotpot.ai/remove-background

For stripping the result back to just the object, you may need a secondary autocrop task.
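
A minimal sketch of what such an autocrop step could look like, assuming the background-removal result has already been loaded as an alpha channel (a grid of 0 to 255 values, where 0 is fully transparent). The function name `autocrop_box` and the list-of-lists input are assumptions for illustration only. This is not how Hotpot or any particular tool implements it.

```python
from typing import List, Optional, Tuple


def autocrop_box(alpha: List[List[int]], threshold: int = 0) -> Optional[Tuple[int, int, int, int]]:
    """Return (left, upper, right, lower) around pixels whose alpha exceeds threshold.

    Returns None when no pixel is above the threshold (nothing to crop to).
    """
    # Rows and columns that contain at least one visible pixel.
    rows = [y for y, row in enumerate(alpha) if any(a > threshold for a in row)]
    if not rows:
        return None
    cols = [x for x in range(len(alpha[0])) if any(row[x] > threshold for row in alpha)]
    # Right and lower edges are exclusive, so add 1 to the last visible index.
    return (min(cols), min(rows), max(cols) + 1, max(rows) + 1)


# Tiny hypothetical alpha mask: the object occupies a small patch in the middle.
mask = [
    [0, 0, 0, 0, 0],
    [0, 0, 255, 255, 0],
    [0, 0, 255, 0, 0],
    [0, 0, 0, 0, 0],
]
print(autocrop_box(mask))  # (2, 1, 4, 3)
```

Raising `threshold` above 0 would ignore faint semi-transparent fringes that background removal sometimes leaves around the object.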

edited Nov 13 '22 at 00:32

answered Nov 12 '22 at 20:33

K J

