
I am using PdfToHtml (a PHP library) to convert PDF files to HTML. The conversion works, but the result is only displayed in the browser and is not saved to a local folder. I want to store the converted HTML in a local folder using PHP, under the same name as the PDF, i.e. mydata.pdf becomes mydata.html. The code that converts the PDF to HTML is:

<?php
// if you are using composer, just use this
include 'vendor/autoload.php';

$pdf = new \TonchikTm\PdfToHtml\Pdf('cv.pdf', [
    'pdftohtml_path' => 'C:/wamp64/www/new/poppler-0.51/bin/pdftohtml.exe',
    'pdfinfo_path' => 'C:/wamp64/www/new/poppler-0.51/bin/pdfinfo.exe'
]);

// get the content of every page and loop over the pages
foreach ($pdf->getHtml()->getAllPages() as $page) {
    echo $page . '<br/>';
}
?>
Zohaib

2 Answers


Just change your foreach to

$filePdf = 'cv'; // your pdf filename without extension
$pdf = new \TonchikTm\PdfToHtml\Pdf($filePdf.'.pdf', [
    'pdftohtml_path' => 'C:/wamp64/www/new/poppler-0.51/bin/pdftohtml.exe',
    'pdfinfo_path' => 'C:/wamp64/www/new/poppler-0.51/bin/pdfinfo.exe'
]);

$counterPage = 1;
foreach ($pdf->getHtml()->getAllPages() as $page) {
    // directory and filename (as a string) where this page should be saved
    $filename = $filePdf . '_' . $counterPage . '.html';

    if (file_exists($filename)) {
        // the file already exists: decide here whether to skip or overwrite it
    } else {
        // the file does not exist yet: create it and write the page into it
        $fileOpen = fopen($filename, 'w+');
        fputs($fileOpen, $page);
        fclose($fileOpen);
    }
    $counterPage++;
    echo $page . '<br/>';
}

This creates one file per page, for example cv_1.html, cv_2.html and so on. If this does not help, you probably need to use file_put_contents together with ob_start() and ob_get_contents().
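A minimal sketch of the output-buffering alternative mentioned above, in case it helps. The `$pages` array is a hypothetical stand-in for whatever `$pdf->getHtml()->getAllPages()` returns, and the target path in the temp directory is only an assumption for illustration:

```php
<?php
// Hypothetical stand-in for the pages returned by $pdf->getHtml()->getAllPages().
$pages = ['<p>Page 1</p>', '<p>Page 2</p>'];

ob_start();                       // start capturing everything that is echoed
foreach ($pages as $page) {
    echo $page . '<br/>';
}
$html = ob_get_contents();        // read what has been captured so far
ob_end_clean();                   // stop buffering without sending it to the browser

// Assumed target location; replace it with the folder you actually want.
$target = sys_get_temp_dir() . DIRECTORY_SEPARATOR . 'cv.html';
file_put_contents($target, $html);
```

With this approach the existing echo loop stays unchanged; the buffer simply redirects its output into a string that is then written to disk.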

  • It shows the error `( ! ) Parse error: syntax error, unexpected ''.html'' (T_CONSTANT_ENCAPSED_STRING) in C:\wamp64\www\new\example.php`, and we have to define the name of the HTML file every time we pass a PDF to it. Is there any way to get the PDF name through code? – Zohaib Apr 22 '18 at 07:47
  • Yes, I forgot to put the ".". Change fopen($filename'.html', 'w+'); to fopen($filename.'.html', 'w+'); – Sider Topalov Apr 22 '18 at 07:50
  • About the name of the HTML file: does your PDF contain one page or more than one? – Sider Topalov Apr 22 '18 at 07:53
  • Sometimes there is only one page, but mostly there are two – Zohaib Apr 22 '18 at 07:55
  • Does your PDF contain separate names for each page inside? – Sider Topalov Apr 22 '18 at 07:56
  • Then you can do something like this. Check my answer again, I will modify it now – Sider Topalov Apr 22 '18 at 07:58
  • it is only saving first page not second – Zohaib Apr 22 '18 at 08:14
  • Delete all the created pages, then try again. If it is still not saving your second page, that means it enters the if branch and not the else branch. Check your code, I think you can do it – Sider Topalov Apr 22 '18 at 08:15
  • Brother, I've tried, but the second page is still not showing. You have been a great help to me; it would be great if you could do me this favour, thanks!! – Zohaib Apr 22 '18 at 08:24
  • Please var_dump($filename) for me and post the output below so I can see which names are being built – Sider Topalov Apr 22 '18 at 08:25
  • Okay, where do you want me to var_dump it? As I am working locally right now, I will share a screenshot of the output, is that okay? – Zohaib Apr 22 '18 at 08:29
  • Before you share a screenshot, check my code and make sure yours is the same (variable names and other things), then try again. If you still get the same problem, share a screenshot of the output – Sider Topalov Apr 22 '18 at 08:30
  • This works for me, please make sure you are not missing something – Sider Topalov Apr 22 '18 at 08:38
  • Now it's storing both pages in different HTML files, i.e. **page 1 to cv_1.html and page 2 to cv_2.html**. Now I'll try to store the data of both pages in one HTML file. Many thanks :) – Zohaib Apr 22 '18 at 08:45

Look at this:

<?php
// if you are using composer, just use this
include 'vendor/autoload.php';
$pdf = new \TonchikTm\PdfToHtml\Pdf('cv.pdf', [
    'pdftohtml_path' => 'C:/wamp64/www/new/poppler-0.51/bin/pdftohtml.exe',
    'pdfinfo_path' => 'C:/wamp64/www/new/poppler-0.51/bin/pdfinfo.exe'
]);
// open the target file, then collect the content of every page into one string
$file = fopen('cv.html', 'w+');
$data = '';
foreach ($pdf->getHtml()->getAllPages() as $page) {
    $data .= $page . "<br/>";
}
// write all pages to cv.html in one go
fputs($file, $data);
fclose($file);

I did not test this code

  • @simon511000 Yes, it stores the file locally, but with this we have to define the name of the HTML file every time we pass a PDF to it. Is there any way to get the PDF name through code? – Zohaib Apr 22 '18 at 07:29
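
One possible way to address the follow-up question in the comments is to derive the HTML name from the PDF path with PHP's built-in pathinfo(). The sketch below is untested against the library itself; `htmlPathForPdf` and `savePagesAsHtml` are hypothetical helper names, not part of PdfToHtml, and the pages are assumed to be an iterable of HTML strings:

```php
<?php
/**
 * Build the output path: same directory and base name as the PDF, but ending in .html.
 * Hypothetical helper, not part of the PdfToHtml library.
 */
function htmlPathForPdf(string $pdfPath): string
{
    $info = pathinfo($pdfPath); // 'dirname' is '.' when the path has no directory part
    return $info['dirname'] . DIRECTORY_SEPARATOR . $info['filename'] . '.html';
}

/**
 * Join all pages into one HTML string and write it next to the PDF.
 * Returns the path of the file that was written.
 */
function savePagesAsHtml(string $pdfPath, iterable $pages): string
{
    $html = '';
    foreach ($pages as $page) {
        $html .= $page . '<br/>';
    }

    $target = htmlPathForPdf($pdfPath);
    if (file_put_contents($target, $html) === false) {
        throw new RuntimeException("Could not write $target");
    }
    return $target;
}
```

Assuming `$pdf` was created from 'cv.pdf' as in the question, a call such as `savePagesAsHtml('cv.pdf', $pdf->getHtml()->getAllPages());` should then write all pages into a single cv.html, so the name no longer has to be typed by hand for each PDF.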