I want to process images before I send them to Tesseract for OCR.

For example:

  • Resize the image
  • Change the resolution to 300 dpi
  • Threshold (B&W image)
  • Sharpen image

How can I automate this process?

Nishant Roy
  • Can it be done in GIMP? Yes, but a more appropriate library to automate this might be Leptonica, VIPS, or even GEGL. These have Python bindings, and Python should be your language of choice (even if you choose GIMP, Python-fu would be better than Script-Fu unless you already know Scheme) – jsbueno May 21 '15 at 13:01
  • How do I write a script for GIMP in Python-fu? – Nishant Roy May 22 '15 at 09:44

1 Answer

I've just put together an answer (https://graphicdesign.stackexchange.com/questions/53919/editing-several-hundred-images-gimp/53965#53965 ) on graphicdesign, which is intended as a GIMP automation primer for people with no programming skills - it should be useful for understanding Python-fu as well.

That same answer links to the official documentation and includes one example of how to create a small script. You should then browse GIMP's PDB to find the exact procedures you want.

But, all in all, you can create a Python file like this:

from gimpfu import *
import glob
import os

def auto(source_folder, dest_folder):
    # glob.glob() returns full paths, so only the base name is reused when saving
    for path in glob.glob(os.path.join(source_folder, "*.png")):
        img = pdb.gimp_file_load(path, path)
        # place the PDB calls that modify the image here

        #disp = pdb.gimp_display_new(img)

        # merging returns the single resulting layer, which is what gets saved
        layer = pdb.gimp_image_merge_visible_layers(img, CLIP_TO_IMAGE)
        out_path = os.path.join(dest_folder, os.path.basename(path))
        pdb.gimp_file_save(img, layer, out_path, out_path)
        # pdb.gimp_display_delete(disp)
        pdb.gimp_image_delete(img)  # drops the image from gimp memory


register("batch_process_for_blah",
         "<short description> Batch Process for Bla",
         "<Extended description text>",
         "author name",
         "license text",
         "copyright note",
         "menu label for plug-in",
         "", # image types for which the plug-in applies - "*" for all, blank for a plug-in that opens the image itself
         [(PF_DIRNAME, "source_folder", "Source Folder", None),
          (PF_DIRNAME, "dest_folder", "Dest Folder", None)], # input parameters, passed to auto() in this order
         [], # output parameters
         auto, # function that GIMP calls when the plug-in runs
         menu="<Image>/File", # location of the entry on the menus
         )
main()

To find the operations you want for the body of the for loop, go to Help->Procedure Browser - or better yet, Filters->Python-Fu->Console and hit Browse - it is almost the same, but with an "Apply" button that makes it easy to test the call and copy it over to your plug-in code.
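As an illustration of what could go inside the loop for the four steps in the question, here is a minimal sketch. The helper name preprocess_for_ocr and all numeric values (target size, unsharp-mask radius and amount, threshold range) are assumptions chosen for the example, not recommendations. The procedure names are the ones listed in the GIMP 2.8 PDB; gimp_threshold in particular is deprecated in GIMP 2.10 in favour of gimp_drawable_threshold, so check the Procedure Browser of your own version. The pdb object is passed in as an argument so the function can be read and tried without importing gimpfu.

```python
def preprocess_for_ocr(pdb, img, width, height):
    """Apply the four steps from the question to an already loaded image.

    `pdb` is passed in explicitly so this function has no import-time
    dependency on GIMP; inside a plug-in, pass gimpfu's `pdb` object.
    All numeric values below are illustrative assumptions.
    """
    # Resize the whole image (all layers) to the given size in pixels.
    pdb.gimp_image_scale(img, width, height)

    # Set the resolution to 300 dpi on both axes.
    pdb.gimp_image_set_resolution(img, 300, 300)

    # The filters below act on a drawable (layer), not on the image itself.
    drawable = pdb.gimp_image_get_active_drawable(img)

    # Sharpen with an unsharp mask: radius 5.0, amount 0.5, threshold 0.
    pdb.plug_in_unsharp_mask(img, drawable, 5.0, 0.5, 0)

    # Threshold to black and white: values in 127..255 become white.
    # (GIMP 2.8 procedure name; see the note above for GIMP 2.10.)
    pdb.gimp_threshold(drawable, 127, 255)

    return drawable
```

In the plug-in above, this would be called right after pdb.gimp_file_load, for example as preprocess_for_ocr(pdb, img, 2000, 3000), with the size replaced by whatever suits your scans. Sharpening is done before thresholding here because a sharpen filter has little left to work with once the image is pure black and white; reorder the calls if your images need something different.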

jsbueno