I am working on a Ruby on Rails application that extracts text and images from PDF files. During extraction, a few of the images come out corrupted.

Is there any way to identify the corrupted images after extraction? Does anyone know why they get corrupted?

I am using the pdftohtml and pdftotext utilities (from Poppler) on Ubuntu.
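To make the "identify after extraction" part concrete, here is a minimal sketch of one possible check. It assumes the extracted files are JPEG or PNG and only verifies that each file starts and ends with the byte sequences those formats require. The helper name `image_looks_intact?` and the `extracted/` directory in the usage comment are hypothetical. This would catch truncated files and damaged headers, but not corruption in the middle of the image data, and a JPEG with extra bytes after its end marker would be flagged even if it still displays.

```ruby
# Hypothetical helper: flags files whose leading or trailing bytes do not
# match what the JPEG and PNG formats require.
JPEG_START = "\xFF\xD8".b             # JPEG start-of-image marker
JPEG_END   = "\xFF\xD9".b             # JPEG end-of-image marker
PNG_START  = "\x89PNG\r\n\x1A\n".b    # 8-byte PNG signature
# A well-formed PNG ends with an empty IEND chunk:
# length 0, type "IEND", and the fixed CRC AE 42 60 82.
PNG_END    = "\x00\x00\x00\x00IEND\xAE\x42\x60\x82".b

def image_looks_intact?(path)
  data = File.binread(path) # read raw bytes, no encoding conversion
  if data.start_with?(JPEG_START)
    data.end_with?(JPEG_END)
  elsif data.start_with?(PNG_START)
    data.end_with?(PNG_END)
  else
    false # unrecognized or damaged header
  end
end

# Usage (assumed directory name):
#   suspect = Dir.glob("extracted/*.{jpg,png}").reject { |f| image_looks_intact?(f) }
```

A stricter check would be to fully decode each file with an image library, at the cost of an extra dependency and more processing time.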

Thanks in advance.

forchetan01
sam

0 Answers