I use the following command to transform a PDF to HTML, and it crashes and leaves a core dump file.
./pdf2htmlEX --zoom 1 --dest-dir ./pdf_test --optimize-text 1 --zoom 1.4 --process-outline 0 --embed-image 0 --font-format ttf pdf_test/020616320411_2.pdf
[The core dump message is here](https://i.stack.imgur.com/RuL9N.png)
I tried many approaches and compared the different PDFs, and came to these conclusions:
- The PDF that causes the core dump contains an "Arial Unicode MS" font. I guess that is probably the reason.
- I created a test PDF using "Arial Unicode MS" and the same content, and pdf2htmlEX handled it without problems.
- The test PDF and the crashing PDF have the same content but different file sizes. Test PDF file size: 502,032. Crashing PDF file size: 445,916.
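
Since the two files differ in size despite having the same content, one way to narrow this down might be to compare the font entries each file declares. Below is a minimal Python sketch of that idea. It is not part of pdf2htmlEX; the helper names `base_fonts` and `compare` are made up for illustration. It assumes the font dictionaries are stored in uncompressed objects, so it may find nothing in PDFs that keep them inside compressed object streams.

```python
import re

# Matches "/BaseFont /Name" entries in raw (uncompressed) PDF objects.
BASEFONT_RE = re.compile(rb"/BaseFont\s*/([^\s/<>\[\]()]+)")
# PDF names escape special characters as #xx (for example #20 for a space).
HEX_ESCAPE_RE = re.compile(rb"#([0-9A-Fa-f]{2})")


def base_fonts(pdf_bytes):
    """Return the set of BaseFont names found in the raw PDF bytes."""
    names = set()
    for raw in BASEFONT_RE.findall(pdf_bytes):
        decoded = HEX_ESCAPE_RE.sub(
            lambda m: bytes([int(m.group(1), 16)]), raw
        ).decode("latin-1")
        # Drop a subset prefix such as "ABCDEF+" so that subsetted and
        # fully embedded versions of the same font compare equal.
        if re.match(r"^[A-Z]{6}\+", decoded):
            decoded = decoded[7:]
        names.add(decoded)
    return names


def compare(path_a, path_b):
    """Return (fonts only in A, fonts only in B) for two PDF files."""
    with open(path_a, "rb") as fa, open(path_b, "rb") as fb:
        fonts_a = base_fonts(fa.read())
        fonts_b = base_fonts(fb.read())
    return fonts_a - fonts_b, fonts_b - fonts_a
```

For example, `compare("pdf_test/020616320411_2.pdf", "pdf_test/test.pdf")` would list the font names that appear in only one of the two files (the second path is a placeholder for wherever the test PDF is saved). If both files report the same names, the difference is more likely in how the font is embedded or subsetted than in which font is used.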