
I generated a 72 dpi image and an XML file (with zoom set to 1) from this PDF.

Although the image was rendered at 72 dpi, converting the co-ordinates in the XML to pixels only worked after I iteratively tweaked the DPI value using this sheet. A value of 90.5 seems to work well. However, this does not look like the proper way to do it.

Command to generate the XML: `pdftohtml -xml -zoom 1 -fontfullname -s -c input.pdf output`

Command to generate the image: `pdftoppm -jpeg -r 72 input.pdf output`

Note: 72 dpi was used for the image because, at 72 dpi, the output image has dimensions similar to those of the PDF and the XML output.
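One way to avoid hand-tuning a DPI value might be to derive the scale from the sizes themselves: divide the rendered image's pixel dimensions by the `width` and `height` attributes of the `<page>` element in the XML. The following is only a minimal sketch under assumptions: the sample XML is made up to resemble `pdftohtml -xml` output, the function name `text_boxes_in_pixels` is hypothetical, and the image dimensions are passed in by hand rather than read from the JPEG.

```python
import xml.etree.ElementTree as ET

# Made-up sample that mimics the shape of pdftohtml -xml output (assumption).
SAMPLE_XML = """<pdf2xml>
<page number="1" position="absolute" top="0" left="0" height="792" width="612">
<text top="19" left="40" width="100" height="12" font="0">Hello</text>
</page>
</pdf2xml>"""


def text_boxes_in_pixels(xml_string, image_width, image_height):
    """Scale the text boxes of the first page to image pixel co-ordinates.

    The scale is taken from the ratio of the image size to the page size
    reported in the XML, so no DPI value has to be guessed.
    """
    root = ET.fromstring(xml_string)
    page = root.find("page")
    scale_x = image_width / float(page.get("width"))
    scale_y = image_height / float(page.get("height"))

    boxes = []
    for text in page.iter("text"):
        boxes.append({
            "text": "".join(text.itertext()),
            "left": round(float(text.get("left")) * scale_x),
            "top": round(float(text.get("top")) * scale_y),
            "width": round(float(text.get("width")) * scale_x),
            "height": round(float(text.get("height")) * scale_y),
        })
    return boxes


# Image the same size as the XML page: values pass through unchanged.
same_size = text_boxes_in_pixels(SAMPLE_XML, 612, 792)
# Image twice as large as the XML page: every value doubles.
double_size = text_boxes_in_pixels(SAMPLE_XML, 1224, 1584)
print(same_size[0]["top"], double_size[0]["top"])  # prints: 19 38
```

If the image and the XML page have the same dimensions, both scale factors are 1 and the `top`/`left` values can be used as pixel offsets directly. Font sizes are not covered by this sketch.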

This conversion is essential because it makes it possible to build the HTML. I am aware that poppler can generate HTML itself; however, since the generated HTML needs to be email compatible, the XML is being used to build the HTML from scratch.

How can the co-ordinates in the XML be converted more reliably?

qwertynik
  • @KJ Sure, will add the commands to the description. The first input in this process is the PDF linked already. And the PDF is generated from a PSD over which, unfortunately, I do not have control. Let me know if adding any other additional info would help. – qwertynik Sep 28 '21 at 13:28
  • @KJ Yes, I can see the squishing of the text. It appears strange. The input source is a PSD file. Knowing the individuals who pass on the input, I know that the PDF is exported from the PSD. Sometimes an online tool like photopea.com is used for the export. – qwertynik Sep 28 '21 at 14:18
  • @KJ So, can the vector displacement denoted in the XML as top/left be taken as pixels? Just a while ago, figured that the tool I was using to measure the pixel distance on the image had issues. Started using paint.net and the values match. A text with the top of 19px in the image (72dpi) has the top value as 19 in the XML too. Left values also match. So is the unit of measurement in the XML generated by poppler, pixels? Also what about the font sizes? – qwertynik Sep 28 '21 at 14:53
  • Let us [continue this discussion in chat](https://chat.stackoverflow.com/rooms/237623/discussion-between-qwertynik-and-k-j). Doing this as per the suggestion by SO. – qwertynik Sep 29 '21 at 06:23
