I want to read a multi-page TIFF file and save the OCR text of each page image to its own .txt file. With the code below, I cannot save each page separately under its own name. How can I do this? (C++ code)

// Open input image with leptonica library
Pix *image = pixRead("/usr/src/tesseract-3.02/phototest.tif");
api->SetImage(image);
// Get OCR result
char *outText;
outText = api->GetUTF8Text();
  • What is the problem? Your code looks like this basic example: https://code.google.com/p/tesseract-ocr/wiki/APIExample – tgmath Mar 27 '14 at 15:10
  • This code returns the OCR result for everything at once, but I want separate text for each page. –  Mar 27 '14 at 15:31
  • I want output like this: Page 1 text: -------------------------------- Page 2 text: ------------------------------- –  Mar 27 '14 at 15:32

1 Answer

According to the Leptonica API, there is a dedicated function, pixReadTiff, which reads a single page from your TIFF file as a Pix. The page index n is zero-based.

PIX *pixReadTiff(const char  *filename, l_int32 n)

It returns NULL if the page does not exist or cannot be read, so you can simply iterate through all pages.

To get the number of pages, you can use this function:

l_int32 tiffGetCount(FILE *fp, l_int32 *pn)

For other details, you may want to look into the API yourself. The tiffio.c source is a good place to start: http://tpgit.github.io/Leptonica/tiffio_8c_source.html
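A minimal sketch of how the two functions above could be combined, assuming Leptonica and Tesseract are installed with their headers at leptonica/allheaders.h and tesseract/baseapi.h, and that the "eng" language data is available. The helper names pageFileName, savePagesAsText and ocrTiffPerPage, as well as the "page_N.txt" naming scheme, are made up for illustration and are not part of either library. The library-dependent part is guarded so the file still compiles where the headers are missing.

```cpp
#include <cstdio>
#include <fstream>
#include <functional>
#include <string>

// Hypothetical helper: output name for a zero-based page index,
// e.g. prefix "page_" and index 0 gives "page_1.txt".
std::string pageFileName(const std::string &prefix, int pageIndex) {
    return prefix + std::to_string(pageIndex + 1) + ".txt";
}

// Hypothetical helper: calls ocrPage(n) for n = 0 .. pageCount-1 and writes
// each result to its own text file. Returns the number of files written.
int savePagesAsText(int pageCount, const std::string &prefix,
                    const std::function<std::string(int)> &ocrPage) {
    int written = 0;
    for (int n = 0; n < pageCount; ++n) {
        std::ofstream out(pageFileName(prefix, n));
        if (!out) continue;  // skip pages whose output file cannot be opened
        out << ocrPage(n);
        ++written;
    }
    return written;
}

#if __has_include(<leptonica/allheaders.h>) && __has_include(<tesseract/baseapi.h>)
#include <leptonica/allheaders.h>
#include <tesseract/baseapi.h>

// Hypothetical wrapper: OCRs every page of a multi-page TIFF into
// separate text files. Returns the number of files written, or -1 on error.
inline int ocrTiffPerPage(const char *tiffPath, const std::string &prefix) {
    // tiffGetCount answers "how many pages are there?"
    FILE *fp = std::fopen(tiffPath, "rb");
    if (!fp) return -1;
    l_int32 pageCount = 0;
    int err = tiffGetCount(fp, &pageCount);
    std::fclose(fp);
    if (err) return -1;

    tesseract::TessBaseAPI api;
    if (api.Init(nullptr, "eng") != 0) return -1;  // assumes "eng" data exists

    int written = savePagesAsText(pageCount, prefix, [&](int n) {
        std::string text;
        Pix *page = pixReadTiff(tiffPath, n);  // n is the zero-based page index
        if (!page) return text;                // unreadable page: empty text
        api.SetImage(page);
        char *outText = api.GetUTF8Text();
        if (outText) {
            text = outText;
            delete[] outText;  // GetUTF8Text result must be freed by the caller
        }
        pixDestroy(&page);
        return text;
    });
    api.End();
    return written;
}
#endif
```

With this in place, a call such as ocrTiffPerPage("/usr/src/tesseract-3.02/phototest.tif", "page_") would be expected to produce page_1.txt, page_2.txt, and so on, one file per TIFF page. So page_number simply runs from 0 to the count reported by tiffGetCount minus one.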

Anthony Raymond
tgmath
  • OK. With pixReadTiff(filename, page_number), how do I know page_number? –  Mar 27 '14 at 16:06