
I need to display PDFs as formatted HTML in the browser, and be able to select, copy, and edit that formatted HTML.

So far I have tried the pdftohtml command-line utility and Mozilla's PDF.js. I can't seem to do both tasks well with a single tool. For instance, PDF.js renders the PDF extremely accurately, but I can't get a formatted text layer out of it: the overlay it creates for text selection carries only positioning styles, not font styles.
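
To make the gap concrete, here is a minimal sketch of how some font information might be recovered from the data PDF.js exposes through `page.getTextContent()`, which resolves to an object with `items` and `styles`. The interfaces below are simplified local copies of those shapes, and `textContentToStyledHtml` and `escapeHtml` are hypothetical helper names, not part of PDF.js. It assumes the font size can be read from the text item's transform matrix, and that the `fontFamily` reported in `styles` is often only a generic fallback such as `serif` or `sans-serif`, so this is not a faithful reproduction of the original fonts.

```typescript
// Simplified local shapes mirroring the result of PDF.js `page.getTextContent()`.
// Only the fields used in this sketch are declared.
interface TextItem {
  str: string;          // the text of this run
  transform: number[];  // [a, b, c, d, e, f] text matrix
  fontName: string;     // key into `styles`
}

interface TextStyle {
  fontFamily: string;   // often a generic fallback family
}

interface TextContent {
  items: TextItem[];
  styles: Record<string, TextStyle>;
}

// Hypothetical helper: escape text for use in HTML content and attributes.
function escapeHtml(text: string): string {
  return text
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;");
}

// Hypothetical helper: turn text items into inline-styled spans.
// Positioning is deliberately dropped; only font family and size are kept.
function textContentToStyledHtml(content: TextContent): string {
  return content.items
    .filter((item) => item.str.length > 0)
    .map((item) => {
      // Assumption: the vertical scale of the text matrix gives the font
      // size in PDF units (unscaled, i.e. viewport scale 1).
      const c = item.transform[2] ?? 0;
      const d = item.transform[3] ?? 0;
      const fontSize = Math.sqrt(c * c + d * d);
      const family = content.styles[item.fontName]?.fontFamily ?? "sans-serif";
      const style = `font-family: ${family}; font-size: ${fontSize.toFixed(1)}px`;
      return `<span style="${escapeHtml(style)}">${escapeHtml(item.str)}</span>`;
    })
    .join("");
}
```

In a page context, the input would presumably come from `await page.getTextContent()`. Weight, slant, and color are not covered by this sketch and would need another source.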

What would be the best approach to solve this problem?

SparklingWater
  • There is also an SVG backend for PDF.js. Converting to formatted HTML involves losing exact character positions, and it's not trivial. The best approach would be to contribute to the PDF.js project. – async5 Apr 20 '17 at 16:44
  • I guess you are right. It might be easier to extract the styling from the SVG. I will try it out. Thanks for the tip! – SparklingWater Apr 20 '17 at 17:22

0 Answers