I have many png files like this:
I want to slice the image into 48 (= 6 x 8) small image files, one for each of the 48 cells separated by the table borders. That is, I would like to have files img11.png, ..., img68.png, where img11.png contains the (1,1) "1.4x4x8" cell, img12.png the (1,2) "M/T" cell, img13.png the "550,000" cell, ..., and img68.png the bottom right "641,500" cell.
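To make the goal concrete, here is a minimal Python sketch of the file naming and crop boxes I have in mind. It assumes the cells are evenly spaced, which is not true for my real images, and it does no border detection at all. The function name grid_cells and the 800 x 300 image size are made up for illustration.

```python
def grid_cells(width, height, rows=6, cols=8):
    """Yield (filename, box) for each cell of an evenly divided rows x cols grid.

    box is (left, upper, right, lower) in pixels, the same ordering that
    Pillow's Image.crop expects. Even spacing is an assumption made only
    for this sketch; real tables need their borders detected first.
    """
    for r in range(rows):
        for c in range(cols):
            left = c * width // cols
            right = (c + 1) * width // cols
            upper = r * height // rows
            lower = (r + 1) * height // rows
            # Row and column numbers are 1-based in the file name: img11.png ... img68.png
            yield f"img{r + 1}{c + 1}.png", (left, upper, right, lower)


# Hypothetical 800 x 300 pixel table image
cells = list(grid_cells(800, 300))
for name, box in cells[:3]:
    print(name, box)
```

What I am missing is the step that replaces the even spacing with the actual border positions found in each image.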
I want to do this because I think it would improve the performance of tesseract, which is currently unsatisfactory because many of my image files are of much poorer quality than the one shown above. Also, the margins and sizes vary, and some images contain non-English characters and pictures.
Are there software packages that can detect the table borders and slice the image into m x n images? I am new to this area. I have read "How to find table like structure in image", but it is way beyond my ability. I am willing to learn, though.
Thanks for your help.