How to extract numbers from an image using IBM Cloud Visual Recognition text recognition?

Question

I'm following this tutorial: https://cloud.ibm.com/docs/services/visual-recognition?topic=visual-recognition-tutorial-recognize-text&locale=en-US#pr-ximos-passos

My goal is to read a document and build a table of its contents. The content consists of KEY - VALUE pairs, such as "VALUE 10.00". I can extract text from the image, but I can't extract the numbers.

Some context on the problem:

I'm using this image

Values that must be extracted:

DATA                        13/06/2016
AGENCIA/CASH                0180/2009
VALOR DEPOSITO EM DINHEIRO  50.00

But when I use the following curl call to the Visual Recognition service:

curl -u "apikey:{API_KEY}" --form "images_file=@teste1.png" "https://gateway.watsonplatform.net/visual-recognition/api/v3/recognize_text?version=2018-03-19" -k

Result (excerpt):

        "text": "data gigolo hora\nman/em 251\nnumero envelope 689 574\nvalor depusitd eh 4\ncpf no defusnantez 614 220\ndata lananzmnz",
        "words": [
            {
                "word": "data",
                "location": {
                    "height": 18,
                    "width": 40,
                    "left": 13,
                    "top": 10
                },
                "score": 0.6098,
                "line_number": 0
            },
            {
                "word": "gigolo",
                "location": {
                    "height": 43,
                    "width": 57,
                    "left": 146,
                    "top": 0
                },
                "score": 0.4283,
                "line_number": 0
            },
            {
                "word": "hora",
                "location": {
                    "height": 18,
                    "width": 39,
                    "left": 249,
                    "top": 11
                },
                "score": 0.6533,
                "line_number": 0
            },
            {
                "word": "man/em",
                "location": {
                    "height": 17,
                    "width": 72,
                    "left": 127,
                    "top": 35
                },
                "score": 0.8187,
                "line_number": 1
            },
            {
                "word": "251",
                "location": {
                    "height": 21,
                    "width": 30,
                    "left": 294,
                    "top": 33
                },
                "score": 0.9881,
                "line_number": 1
            },
            {
                "word": "numero",
                "location": {
                    "height": 21,
                    "width": 54,
                    "left": 12,
                    "top": 52
                },
                "score": 0.9116,
                "line_number": 2
            },

Note that some words are extracted well, but the numbers are not. My main goal is to extract monetary values and dates.
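
To make that goal concrete, here is a minimal Python sketch of how dates and amounts could be picked out of the returned "text" field once the digits are recognized correctly. The two patterns are assumptions based only on the expected values listed above (dates as DD/MM/YYYY, amounts with two decimal places), and the helper name find_dates_and_amounts is hypothetical. It does not fix the recognition itself.

```python
import re

# Assumed formats, inferred from the expected values in the question:
# dates like 13/06/2016 and amounts like 50.00 (or 50,00).
DATE_RE = re.compile(r"\b\d{2}/\d{2}/\d{4}\b")
MONEY_RE = re.compile(r"\b\d+[.,]\d{2}\b")


def find_dates_and_amounts(text):
    """Return (dates, amounts) found in a block of recognized text."""
    dates = DATE_RE.findall(text)
    # Blank out the dates first so their digits cannot be read as amounts.
    remainder = DATE_RE.sub(" ", text)
    amounts = MONEY_RE.findall(remainder)
    return dates, amounts


# What a correct recognition of the document should contain.
sample = "DATA 13/06/2016\nVALOR DEPOSITO EM DINHEIRO 50.00"
print(find_dates_and_amounts(sample))
```

Run against the "text" value in the result above, the same function finds nothing, which is exactly the problem.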

To build my table, I can use the "height" property to work out which numeric value belongs to each key.
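
A rough sketch of that pairing step in Python, under some assumptions: instead of "height" it groups words by the "line_number" field and orders them by "location.left", both of which appear in the response above. The function name rows_from_words, the min_score parameter, and the rule that any token containing a digit counts as a value are all hypothetical choices for illustration.

```python
from collections import defaultdict


def rows_from_words(words, min_score=0.0):
    """Group recognized words into (key, value) rows, one per line_number."""
    lines = defaultdict(list)
    for w in words:
        # Optionally drop low-confidence words.
        if w["score"] >= min_score:
            lines[w["line_number"]].append(w)

    rows = []
    for line_number in sorted(lines):
        # Read each line left to right.
        ordered = sorted(lines[line_number], key=lambda w: w["location"]["left"])
        tokens = [w["word"] for w in ordered]
        # Assumption: tokens containing a digit form the value, the rest the key.
        key = " ".join(t for t in tokens if not any(c.isdigit() for c in t))
        value = " ".join(t for t in tokens if any(c.isdigit() for c in t))
        rows.append((key, value))
    return rows


# Three entries copied from the response excerpt above.
words = [
    {"word": "man/em", "location": {"height": 17, "width": 72, "left": 127, "top": 35},
     "score": 0.8187, "line_number": 1},
    {"word": "251", "location": {"height": 21, "width": 30, "left": 294, "top": 33},
     "score": 0.9881, "line_number": 1},
    {"word": "numero", "location": {"height": 21, "width": 54, "left": 12, "top": 52},
     "score": 0.9116, "line_number": 2},
]
for key, value in rows_from_words(words):
    print(repr(key), repr(value))
```

The pairing only helps if the numbers themselves are recognized, which is the part that is failing.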

So, how do I extract the numbers?

PS: This is a Brazilian Portuguese (pt-BR) document.

score 1 · Answer 1 · answered Jul 18 '19 at 13:22

Thank you for your interest in the service... however, as available today, this beta service is mostly trained on an English language dictionary. Although it can recognize short numeric strings, it will not do particularly well on tasks like reading arbitrary numbers such as prices, serial numbers or license plates. Also the Brazilian Portuguese words will probably not be found.

Matt Hill

Is there a way to train or improve the service so that it recognizes these kinds of numbers? – Augusto Jul 18 '19 at 14:47
Unfortunately, no; this part of the service is not trainable. You can, however, train a custom classifier to produce labels that apply to the whole image, for example to tell printed receipts apart from handwritten ones. – Matt Hill Jul 19 '19 at 15:20
