
I used the following command to convert a PDF to HTML, and it crashed with a core dump:

    ./pdf2htmlEX --zoom 1 --dest-dir ./pdf_test --optimize-text 1 --zoom 1.4 --process-outline 0 --embed-image 0 --font-format ttf pdf_test/020616320411_2.pdf

[The core dump message is here](https://i.stack.imgur.com/RuL9N.png)

I tried several approaches and compared different PDFs, and reached these conclusions:

  1. The PDF that triggers the core dump contains an "Arial Unicode MS" font. I suspect that is the cause.
  2. I created a test PDF using "Arial Unicode MS" and the same content, and pdf2htmlEX converted it without problems.
  3. The test PDF and the crashing PDF have the same content but different file sizes: the test PDF is 502,032 and the crashing PDF is 445,916.
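
Since the two files have the same content but different sizes, one thing worth checking is whether the font is embedded differently in each (for example subset vs. full, or a different font type or encoding). Below is a minimal sketch of such a comparison. It assumes the `pdffonts` tool from poppler-utils is installed and prints its usual table (two header lines, then the columns name, type, encoding, emb, sub, uni, object ID). The function names are hypothetical helpers, not part of pdf2htmlEX, and a difference found this way is only a hint, not a confirmed cause of the crash.

```python
import subprocess


def parse_pdffonts(text):
    """Parse the table printed by `pdffonts` into a list of dicts.

    Assumed layout: a header line, a dashed rule, then one row per font with
    the columns name, type, encoding, emb, sub, uni, object ID (two numbers).
    """
    fonts = []
    for line in text.splitlines()[2:]:  # skip the header and the dashed rule
        tokens = line.split()
        if len(tokens) < 8:
            continue  # too short to be a font row
        fonts.append({
            "name": tokens[0],
            "type": " ".join(tokens[1:-6]),  # the type may contain spaces
            "encoding": tokens[-6],
            "embedded": tokens[-5] == "yes",
            "subset": tokens[-4] == "yes",
            "unicode": tokens[-3] == "yes",
        })
    return fonts


def describe(fonts):
    """Map each base font name (subset prefix such as 'ABCDEF+' removed)
    to a tuple of (type, encoding, embedded, subset)."""
    return {
        f["name"].split("+", 1)[-1]:
            (f["type"], f["encoding"], f["embedded"], f["subset"])
        for f in fonts
    }


def compare(fonts_a, fonts_b):
    """Return {font name: (properties in A, properties in B)} for every font
    whose properties differ, or that appears in only one of the two files."""
    a, b = describe(fonts_a), describe(fonts_b)
    return {
        name: (a.get(name), b.get(name))
        for name in sorted(set(a) | set(b))
        if a.get(name) != b.get(name)
    }


def run_pdffonts(path):
    """Run `pdffonts` on one file and return its text output."""
    result = subprocess.run(
        ["pdffonts", path], capture_output=True, text=True, check=True
    )
    return result.stdout


def compare_files(path_a, path_b):
    """Compare the fonts reported by `pdffonts` for two PDF files."""
    return compare(parse_pdffonts(run_pdffonts(path_a)),
                   parse_pdffonts(run_pdffonts(path_b)))
```

For example, `compare_files("pdf_test/020616320411_2.pdf", "test.pdf")` would return the fonts that differ between the crashing file and the test file, where `test.pdf` is a placeholder name for the working test PDF.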
