iOS SDK Realtime Text Removal on AVCaptureSession Frame of Image in focus

Question

I am developing an iOS app similar to Word Lens. What I want is the ability to add blurred patches as an overlay on certain parts of the text, as shown in the attached image. If it is possible to wipe the text out of an image completely, I would appreciate that too. I have searched the internet a lot and found that this has something to do with OpenGL ES and OpenCV, but I do not know exactly how to achieve it.

Any guidance would be highly appreciated!

Thanks. Please have a look at the image here, which shows blurred rectangles over an image under AVCaptureSession focus.

Mehmet Emre Portakal · Answer 1 · 2014-01-25T07:58:56.370

There is an SDK called Vuforia (QCAR). If you examine its Text Recognition sample, it will give you an idea.

For example:

First, change the OpenGL ES shader program to render a square.

After that, you need to detect what color is behind the word. To do so, Vuforia gives you read-only access to the image data, like this:

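// Take the frame once, then look for its RGB888 image.
QCAR::Frame vbFrame = state.getFrame();
const QCAR::Image *vbImage = NULL; // stays NULL if no RGB888 image is present
for (int i = 0; i < vbFrame.getNumImages(); i++) {
    const QCAR::Image *candidate = vbFrame.getImage(i);
    if (candidate->getFormat() == QCAR::RGB888) {
        vbImage = candidate;
        break;
    }
}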

From that point on, you need Vuforia experience and some knowledge of raw image data.

To give you an idea, you can get the RGB color values of a point like this:

- (void) getColorFromVGImage:(int)xx andY:(int)yy
{
    // Pixels are bytes in the range 0-255, so read them as unsigned.
    const unsigned char* vbImageData = (const unsigned char*) vbImage->getPixels();

    int maxXx = vbImage->getWidth() - 1;
    int maxYy = vbImage->getHeight() - 1;

    int bytesPerPixel = 3;                   // RGB888: one byte per channel
    int bytesPerRow = vbImage->getStride();  // may include row padding
    int byteIndex = ((bytesPerRow * yy) + (xx * bytesPerPixel));

    // Check the coordinates themselves so that pixel (0, 0) is accepted
    // and an out-of-range x cannot wrap into the next row.
    if (xx >= 0 && xx <= maxXx && yy >= 0 && yy <= maxYy) {
        int r = vbImageData[byteIndex];
        int g = vbImageData[byteIndex + 1];
        int b = vbImageData[byteIndex + 2];
        // Use r, g and b here, e.g. as the fill color of the square.
    }
}

Please note: this process is for RGB888 image data.
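
To make the "detect what color is behind the word" step more concrete, here is a minimal sketch in plain C++. It is not taken from the Vuforia samples. It assumes you already have an RGB888 buffer with a known stride and a bounding box for the word; `Rgb`, `Box`, `averageBorderColor` and `fillBox` are hypothetical names introduced only for illustration. The idea is to average the pixels on the edge of the box, which usually lie on the background rather than on the glyphs, and then paint the box with that color.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

struct Rgb { std::uint8_t r, g, b; };

// Hypothetical bounding box of a detected word, in pixel coordinates.
// right and bottom are exclusive.
struct Box { int left, top, right, bottom; };

// Clip a box so that it lies inside a width x height image.
static Box clampBox(Box b, int width, int height) {
    b.left = std::max(0, b.left);
    b.top = std::max(0, b.top);
    b.right = std::min(width, b.right);
    b.bottom = std::min(height, b.bottom);
    return b;
}

// Average the pixels on the one-pixel-wide edge of the box.
// Assumption: the edge of the box lies on the background, not on the text.
Rgb averageBorderColor(const std::uint8_t* pixels, int width, int height,
                       int stride, Box box) {
    assert(pixels != nullptr && stride >= width * 3);
    box = clampBox(box, width, height);
    unsigned long sumR = 0, sumG = 0, sumB = 0, count = 0;
    for (int y = box.top; y < box.bottom; ++y) {
        for (int x = box.left; x < box.right; ++x) {
            const bool onBorder = (y == box.top || y == box.bottom - 1 ||
                                   x == box.left || x == box.right - 1);
            if (!onBorder) continue;
            const std::uint8_t* p = pixels +
                static_cast<std::size_t>(y) * stride +
                static_cast<std::size_t>(x) * 3;
            sumR += p[0];
            sumG += p[1];
            sumB += p[2];
            ++count;
        }
    }
    if (count == 0) return Rgb{0, 0, 0};  // empty box after clipping
    return Rgb{static_cast<std::uint8_t>(sumR / count),
               static_cast<std::uint8_t>(sumG / count),
               static_cast<std::uint8_t>(sumB / count)};
}

// Paint the whole box with one color. This needs a writable buffer.
void fillBox(std::uint8_t* pixels, int width, int height, int stride,
             Box box, Rgb color) {
    assert(pixels != nullptr && stride >= width * 3);
    box = clampBox(box, width, height);
    for (int y = box.top; y < box.bottom; ++y) {
        for (int x = box.left; x < box.right; ++x) {
            std::uint8_t* p = pixels +
                static_cast<std::size_t>(y) * stride +
                static_cast<std::size_t>(x) * 3;
            p[0] = color.r;
            p[1] = color.g;
            p[2] = color.b;
        }
    }
}
```

Because Vuforia's image data is read-only, `fillBox` is meant to run on your own writable copy of the pixels. Alternatively, skip it and pass the averaged color to the shader that renders the square.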

Hope this helps.

score 0 · Answer 2 · answered Dec 29 '13 at 15:16

Yes, it's possible, and yes, OpenCV and OpenGL would be good technologies to use.

OpenCV is a cross-platform, hardware-accelerated computer vision library. You could use it to develop routines that find text in a video feed and place boxes around it. The more context you have about what you are viewing, the faster and more reliable you can make it. (For example, if you know you will be scanning a paper form with a fixed layout and boxes for user information, the job is much easier than scanning an arbitrary image whose text may vary in font, size and layout.)
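
Once a detector has produced a box around a piece of text, blurring that box is a per-pixel averaging step. The following is a minimal sketch in plain C++ of a box blur restricted to one rectangle of an RGB888 buffer. It only illustrates the idea: `TextBox` and `boxBlurRegion` are hypothetical names, not OpenCV API, and the text detection itself is assumed to have happened already. With OpenCV you would more likely apply `cv::GaussianBlur` to a region of interest of a `cv::Mat`.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical box reported by a text detector, in pixel coordinates.
struct TextBox { int x, y, width, height; };

// Replace every pixel inside the box with the mean of its
// (2 * radius + 1) x (2 * radius + 1) neighborhood, clipped at the image
// edges. Pixels outside the box are left unchanged.
void boxBlurRegion(std::vector<std::uint8_t>& pixels, int width, int height,
                   int stride, TextBox box, int radius) {
    assert(radius >= 0 && stride >= width * 3);
    assert(pixels.size() >= static_cast<std::size_t>(stride) * height);

    // Read from an unmodified copy so already blurred pixels do not feed
    // back into their neighbors.
    const std::vector<std::uint8_t> source = pixels;

    const int x0 = std::max(0, box.x);
    const int y0 = std::max(0, box.y);
    const int x1 = std::min(width, box.x + box.width);
    const int y1 = std::min(height, box.y + box.height);

    for (int y = y0; y < y1; ++y) {
        for (int x = x0; x < x1; ++x) {
            unsigned long sum[3] = {0, 0, 0};
            unsigned long count = 0;
            for (int ny = std::max(0, y - radius);
                 ny <= std::min(height - 1, y + radius); ++ny) {
                for (int nx = std::max(0, x - radius);
                     nx <= std::min(width - 1, x + radius); ++nx) {
                    const std::size_t i =
                        static_cast<std::size_t>(ny) * stride +
                        static_cast<std::size_t>(nx) * 3;
                    sum[0] += source[i];
                    sum[1] += source[i + 1];
                    sum[2] += source[i + 2];
                    ++count;
                }
            }
            const std::size_t o = static_cast<std::size_t>(y) * stride +
                                  static_cast<std::size_t>(x) * 3;
            for (int c = 0; c < 3; ++c) {
                pixels[o + c] = static_cast<std::uint8_t>(sum[c] / count);
            }
        }
    }
}
```

A larger `radius` hides the text more thoroughly but costs more per frame, since the work grows with the square of the radius. For a live camera feed you would probably move this step to the GPU, which is where OpenGL ES comes in.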

However, both of those are very advanced frameworks that take weeks or months to become proficient with, even for experienced developers.

If you are a newbie, you are in over your head.

Hi @Duncan, I really appreciate your comment. It would be more helpful if you could give some code-level insight, such as which OpenCV functions to use and which part of the OpenGL ES framework I should focus on. Anything that can give me a quick start for learning would help. — inkaas, Dec 29 '13 at 17:58
