0

I have encountered a problem while setting up the font_properties file to train the Tesseract 3.01 OCR engine. According to the 3.01 documentation, you are required to set up a font_properties file. Each line of the file has the format

<fontname> <italic> <bold> <fixed> <serif> <fraktur>

where 0 or 1 flags must be used to indicate the properties. Does anyone know what fixed, serif, or fraktur means?

When I run it with my font_properties file, it throws the following error:

[enter image description here]

Thank you

Mr.Noob

3 Answers

1

Fixed (or monospaced), Serif, and Fraktur are standard font descriptors - you can look up what they mean on Wikipedia.

Regarding your error, make sure your font_properties file is formatted correctly, as outlined in the Training Tesseract 3 tutorial linked below. If you're training only one font, the file should contain a single line; in your case:

times_new_roman 0 0 0 1 0

You haven't included what you've put in your font_properties file, but note that your font name should not have spaces!

http://code.google.com/p/tesseract-ocr/wiki/TrainingTesseract3
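As a rough way to sanity-check a font_properties line before running training, here is a minimal Python sketch. It assumes the field order given in the Tesseract 3 training wiki (fontname, then italic, bold, fixed, serif, fraktur flags); the names FontProps and parse_font_properties_line are made up for this illustration and are not part of Tesseract.

```python
from typing import NamedTuple


class FontProps(NamedTuple):
    """One font_properties entry (hypothetical helper type, not a Tesseract API)."""
    name: str
    italic: bool
    bold: bool
    fixed: bool
    serif: bool
    fraktur: bool


def parse_font_properties_line(line: str) -> FontProps:
    """Parse '<fontname> <italic> <bold> <fixed> <serif> <fraktur>'.

    Raises ValueError if the line does not have exactly six
    whitespace-separated fields or if any flag is not 0 or 1.
    """
    fields = line.split()
    if len(fields) != 6:
        # A font name containing spaces ends up here, because the
        # extra words are counted as extra fields.
        raise ValueError(
            "expected 6 fields (name + 5 flags), got %d: %r" % (len(fields), line)
        )
    name, *flags = fields
    if any(flag not in ("0", "1") for flag in flags):
        raise ValueError("flags must be 0 or 1: %r" % line)
    return FontProps(name, *(flag == "1" for flag in flags))


if __name__ == "__main__":
    print(parse_font_properties_line("times_new_roman 0 0 0 1 0"))
```

A line such as "times new roman 0 0 0 1 0" is rejected by this check because the spaces in the name turn it into eight fields instead of six.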

1

No input files to Tesseract training should have spaces in their names.

The entry in font_properties should match the fontname part of the name of the image file; e.g., if font_properties has uknumberplate, then the filename of your image should be eng.uknumberplate.exp0.tif.
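To illustrate the matching rule, here is a small Python sketch that pulls the fontname part out of a training file name and lists any files with no matching font_properties entry. It assumes the lang.fontname.expN.ext naming pattern used in the example above; the function names are hypothetical and only for illustration.

```python
def fontname_from_training_file(filename):
    """Return the fontname part of a name like 'eng.uknumberplate.exp0.tif'."""
    if " " in filename:
        raise ValueError("training file names must not contain spaces: %r" % filename)
    parts = filename.split(".")
    # Expect exactly: language, fontname, expN, extension
    if len(parts) != 4 or not parts[2].startswith("exp"):
        raise ValueError("expected lang.fontname.expN.ext, got %r" % filename)
    return parts[1]


def files_without_font_entry(filenames, font_properties_text):
    """Return the files whose fontname has no line in font_properties."""
    known = {
        line.split()[0]
        for line in font_properties_text.splitlines()
        if line.strip()  # skip blank lines
    }
    return [f for f in filenames if fontname_from_training_file(f) not in known]


if __name__ == "__main__":
    files = ["eng.uknumberplate.exp0.tif", "eng.uknumberplate.exp0.box"]
    print(files_without_font_entry(files, "uknumberplate 0 0 0 0 0\n"))
```

An empty list means every image and box file has a corresponding font_properties entry.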

nguyenq
  • OK, but can you tell me what the file names should be for the font_properties file, the image file, and the box file? I'm totally confused now :( Thanks – Mr.Noob Jul 25 '12 at 15:53
  • Just follow closely the Tesseract Training Wiki (http://code.google.com/p/tesseract-ocr/wiki/TrainingTesseract3). The fontname should be the same or close to the name of the font. For example, for Times New Roman italic, name for image would be eng.timesi.exp0.tif. The fontname part of the box file and the entry in font_properties should match that of the image, e.g.: eng.timesi.exp0.box and timesi 0 0 0 1 0, respectively. – nguyenq Jul 25 '12 at 23:31
  • Do you have any idea about this? http://stackoverflow.com/questions/11674288/what-files-should-be-included-in-the-tessdata-folder-after-training-tesseract – Mr.Noob Jul 26 '12 at 17:38
0

You have to put font_properties.txt in the command. On Windows an exception is then thrown, but it does find the font properties file.

Tawfiq Chowdhury