We are developing a website that needs to convert PDF files into HTML because some of the PDFs contain forms (not necessarily fillable PDFs; these PDFs are meant to be printed and filled in by hand).
We want users to fill them in through our website instead of printing the files and filling them in by pen. We are going paperless.
DocuSign provides this: you can upload a PDF and then customize it with text boxes and checkboxes. So we're using DocuSign as a reference, but we still haven't figured out how they do it (an almost perfect conversion from PDF to HTML and back).
So far I've tried several third-party tools for converting PDF to HTML: XPDF, Poppler, and ImageMagick.
ImageMagick converts a PDF to an image, which is not suitable because these images produce a very large file when converted back to a PDF for printing.
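To put rough numbers on the size problem, here is a small sketch of the arithmetic. It assumes the default PDF unit of 72 points per inch and uncompressed 24-bit RGB pixels; the function name `raster_size` is just for illustration, and real files will be smaller after compression.

```python
PT_PER_INCH = 72  # default PDF user-space units per inch


def raster_size(width_pt, height_pt, dpi, bytes_per_pixel=3):
    """Return (width_px, height_px, uncompressed_bytes) for one rasterized page.

    bytes_per_pixel=3 assumes 24-bit RGB with no alpha channel.
    """
    width_px = round(width_pt / PT_PER_INCH * dpi)
    height_px = round(height_pt / PT_PER_INCH * dpi)
    return width_px, height_px, width_px * height_px * bytes_per_pixel


# US Letter is 612 x 792 points (8.5 x 11 inches).
for dpi in (72, 150, 300):
    w, h, size = raster_size(612, 792, dpi)
    print(f"{dpi} DPI: {w} x {h} px, about {size / 1_000_000:.1f} MB uncompressed")
```

A print-quality raster (around 300 DPI) is roughly 17 times the pixel count of a 72 DPI one, which is why image-based pages get heavy quickly.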
Based on my research, Poppler is a fork of XPDF. I tried it after XPDF to see if it's better. It basically does what XPDF does, but the HTML it generates uses larger pixel dimensions in the CSS. That's fine, but it loses the font family.
XPDF converts PDF to HTML, but the pixel dimensions are smaller, so when I convert it back to PDF it does not fill the whole page, and I still have to manually adjust all the CSS to make it fit.
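The size mismatch looks like a unit problem, so here is a minimal sketch of the conversion involved. It assumes the default PDF unit (72 points per inch) and the CSS definition of 96 px per inch. The helper names and the example page widths are hypothetical, and I have not confirmed what scale each converter actually uses.

```python
PT_PER_INCH = 72.0      # default PDF user-space units per inch
CSS_PX_PER_INCH = 96.0  # CSS defines 1in = 96px


def pt_to_css_px(points, zoom=1.0):
    """Convert a length in PDF points to CSS pixels at the given zoom."""
    return points * zoom * CSS_PX_PER_INCH / PT_PER_INCH


def scale_to_fit(html_page_width_px, target_width_pt):
    """Scale factor that makes the generated HTML page as wide as the target page.

    html_page_width_px: page width found in the converter's CSS (assumed input).
    target_width_pt: width of the original PDF page in points.
    """
    return pt_to_css_px(target_width_pt) / html_page_width_px


# US Letter is 612 pt wide, which is 816 CSS px at true size.
print(pt_to_css_px(612))
# If a converter emitted the page 612 px wide, it would need scaling up by ~1.33.
print(scale_to_fit(612, 612))
# If a converter emitted the page 918 px wide, it would need scaling down by ~0.89.
print(scale_to_fit(918, 612))
```

Applying one factor like this to the page container (for example with a CSS transform) might avoid adjusting every rule by hand, though I have not verified how the HTML-to-PDF step treats transforms.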
After using these third-party tools, I convert the HTML files back into PDF using MPDF, and the converted files have many inconsistencies. Text is not aligned properly. It's basically not the same as the original PDF.
Any help will be appreciated, thanks!