Questions tagged [hocr]

hOCR is an open standard that defines a data format for representing OCR output. The standard aims to embed layout, recognition confidence, style, and other information into the recognized text itself. It does this by encoding that data in standard HTML.
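
As a rough illustration of what "embedding in HTML" means in practice, here is a minimal Python sketch that reads word text, bounding boxes, and confidence values back out of an hOCR fragment. The sample markup is hand-written in the style Tesseract typically emits (the `ocrx_word` class, with `bbox` and `x_wconf` properties in the `title` attribute); the helper names `parse_title`, `WordCollector`, and `extract_words` are made up for this example. It assumes word spans contain no nested spans, so treat it as a starting point rather than a full hOCR parser.

```python
from html.parser import HTMLParser

# Hand-written sample in the style of Tesseract's hOCR output (an assumption,
# not taken from a real OCR run).
SAMPLE_HOCR = """
<div class='ocr_page' id='page_1' title='bbox 0 0 640 480'>
  <span class='ocr_line' id='line_1_1' title='bbox 36 92 210 116'>
    <span class='ocrx_word' id='word_1_1' title='bbox 36 92 96 116; x_wconf 95'>Hello</span>
    <span class='ocrx_word' id='word_1_2' title='bbox 110 92 210 116; x_wconf 88'>world</span>
  </span>
</div>
"""


def parse_title(title):
    """Split an hOCR title attribute into {property_name: [values...]}."""
    props = {}
    for part in title.split(";"):
        tokens = part.split()
        if tokens:
            props[tokens[0]] = tokens[1:]
    return props


class WordCollector(HTMLParser):
    """Collects text, bbox, and confidence for every ocrx_word element."""

    def __init__(self):
        super().__init__()
        self.words = []
        self._current = None  # the word being read, if any

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if "ocrx_word" in (attrs.get("class") or "").split():
            props = parse_title(attrs.get("title") or "")
            self._current = {
                "text": "",
                # bbox is "x0 y0 x1 y1" in pixels
                "bbox": tuple(int(v) for v in props.get("bbox", [])),
                # x_wconf is the word confidence; None if absent
                "conf": float(props["x_wconf"][0]) if "x_wconf" in props else None,
            }

    def handle_data(self, data):
        if self._current is not None:
            self._current["text"] += data

    def handle_endtag(self, tag):
        # Assumes no nested <span> inside a word span.
        if tag == "span" and self._current is not None:
            self.words.append(self._current)
            self._current = None


def extract_words(hocr):
    collector = WordCollector()
    collector.feed(hocr)
    collector.close()
    return collector.words


if __name__ == "__main__":
    for word in extract_words(SAMPLE_HOCR):
        print(word["text"], word["bbox"], word["conf"])
```

Because the data lives in ordinary `class` and `title` attributes, any HTML parser can read it; the standard-library parser is used here only to keep the sketch dependency-free.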

Public Specification for the hOCR Format

31 questions
0 votes, 1 answer

C#: generate an hOCR file using the charlesw Tesseract wrapper

How can I generate hOCR using the Tesseract wrapper here? Currently I need to dynamically add the location of the tessdata to the environment variables and run my code: System.Diagnostics.Process pProcess = new System.Diagnostics.Process(); …
classname13