
I need to convert a PDF document to HTML, edit the HTML, and then convert it back to PDF. I use the Ubuntu `pdftohtml` command (a program that converts PDF files into HTML, XML and PNG images) from PHP, as in the code below:

<?php $output = shell_exec('pdftohtml create.pdf updated.html'); ?>

It converts the whole document successfully, but it places all the images at the top of the page. Can anyone help me fix this?

Nadimul De Cj

1 Answer

  • You can preserve the layout of the original PDF (headers, footers, paging, etc.) in the converted HTML file using the “-c” (complex output) flag. The similarly purposed “-layout” flag belongs to pdftotext and is not accepted by pdftohtml.

    $output = shell_exec('pdftohtml -c create.pdf updated.html');
    
  • To convert only a range of pages in a PDF file, use the “-f” and “-l” (a lowercase “L”) flags to specify the first and last pages of the range.

    $output = shell_exec('pdftohtml -f 5 -l 9 create.pdf updated.html');
    
  • To convert a PDF file that’s protected and encrypted with an owner password, use the “-opw” flag (the first character in the flag is a lowercase letter “O”, not a zero).

    $output = shell_exec('pdftohtml -opw ' . escapeshellarg('password') . ' create.pdf updated.html');
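
The flags above can be combined in one call. Below is a minimal sketch of a helper that assembles the command and shell-escapes every value, assuming poppler's pdftohtml with its “-c” (complex output), “-f”, “-l” and “-opw” options. The function name `buildPdfToHtmlCommand` and the option keys (`complex`, `first`, `last`, `ownerPassword`) are hypothetical names chosen for this example, not part of any library.

```php
<?php
/**
 * Build a pdftohtml command line with every value shell-escaped.
 * Hypothetical helper: the option keys are this sketch's own names.
 */
function buildPdfToHtmlCommand(string $pdf, string $html, array $opts = []): string
{
    $parts = ['pdftohtml'];

    if (!empty($opts['complex'])) {
        $parts[] = '-c';                          // complex output mode
    }
    if (isset($opts['first'])) {
        $parts[] = '-f ' . (int) $opts['first'];  // first page to convert
    }
    if (isset($opts['last'])) {
        $parts[] = '-l ' . (int) $opts['last'];   // last page to convert
    }
    if (isset($opts['ownerPassword'])) {
        // Escaping keeps quotes or spaces in the password from breaking the command.
        $parts[] = '-opw ' . escapeshellarg($opts['ownerPassword']);
    }

    $parts[] = escapeshellarg($pdf);
    $parts[] = escapeshellarg($html);

    return implode(' ', $parts);
}

$cmd = buildPdfToHtmlCommand('create.pdf', 'updated.html', [
    'complex' => true,
    'first'   => 5,
    'last'    => 9,
]);
echo $cmd, PHP_EOL;

// Uncomment to run the conversion; "2>&1" also captures error messages.
// $output = shell_exec($cmd . ' 2>&1');
```

Redirecting stderr with `2>&1` is worth keeping when debugging, because `shell_exec` otherwise returns only standard output and a rejected flag would go unnoticed.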
    

Source

Abdelrahman Wahdan
  • I did everything you mentioned and your code works perfectly, but my concern is that when I convert the PDF, all of its images are moved to the top of the HTML. How can I prevent that? – Nadimul De Cj Jan 26 '16 at 09:32
  • Which `PHP` library exactly did you use? I have been using this one and it works perfectly: github.com/mgufrone/pdf-to-html – Abdelrahman Wahdan Jan 26 '16 at 19:17
  • OK, I tried the library you mentioned. It also converts the file successfully, but my problem is the images: they go to the top. Here is my PDF screenshot: [pdf](https://www.dropbox.com/s/ncxcioipdg73rgd/pdf-image.gif?dl=0), and here is the HTML screenshot after conversion: [html](https://www.dropbox.com/s/cyx6cawukh4we6c/html-image.gif?dl=0) – Nadimul De Cj Jan 27 '16 at 05:09
  • Well, I expected to see a screenshot of the `HTML` page, not the `PDF` file, because you mentioned that this problem occurs in the `HTML` page after conversion. – Abdelrahman Wahdan Jan 27 '16 at 05:14
  • OK, this is the HTML code screenshot: [HTML screenshot](https://www.dropbox.com/s/u27zv05z4sp0za3/html-view.png?dl=0) – Nadimul De Cj Jan 27 '16 at 06:16