
How can I force pdftohtml to output UTF-8?

$ pdftohtml -enc utf8 my.pdf 
Error: Couldn't find unicodeMap file for the 'utf-8' encoding

And -listenc doesn't seem to be a valid option.

I think it is using ISO-8859-1 by default, although for some reason Vim reads the file and displays special characters fine, even though :set enc? reports utf-8.

theonlygusti

1 Answer


Write the encoding name as UTF-8 (uppercase, with the hyphen) instead of utf8:

$ pdftohtml -enc UTF-8 my.pdf
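
To confirm what encoding the result is actually in, one option is to check whether the file decodes cleanly as UTF-8. The following is only a minimal sketch: the helper name is_utf8 and the output name my.html are assumptions for illustration, and it relies on iconv being installed. Note that a file containing only ASCII passes this check whatever encoding was requested, and that in Vim :set enc? shows Vim's internal encoding, while :set fenc? shows the one detected for the file.

```shell
# Hypothetical helper: succeeds only if the file is valid UTF-8.
# iconv exits non-zero when it meets a byte sequence that is not valid UTF-8.
is_utf8() {
    iconv -f UTF-8 -t UTF-8 "$1" > /dev/null 2>&1
}

# Possible usage, assuming the conversion produced my.html:
#   pdftohtml -enc UTF-8 my.pdf
#   is_utf8 my.html && echo "valid UTF-8" || echo "not valid UTF-8"
```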
Reynaldo Aceves