Questions tagged [pdf-to-html]
79 questions
1
vote
1 answer
Executing a shell command within PHP
On the terminal, I run this successfully within the web application's directory:
pdftohtml -c -noframes "documents/document1.pdf"
Now I want to do this through PHP, so I wrote a shell.sh file that looks like this:
sudo pdftohtml -c -noframes…

Mohamed Khamis
- 7,731
- 10
- 38
- 58
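
A minimal sketch of the invocation quoted in the question above, written in Python rather than the asker's PHP. It runs the command directly instead of through a `shell.sh` wrapper with `sudo`. The helper names `build_pdftohtml_command` and `convert` are hypothetical; only the `-c` and `-noframes` flags and the example path come from the excerpt.

```python
import shutil
import subprocess


def build_pdftohtml_command(pdf_path, binary="pdftohtml"):
    """Return the argument list for the invocation quoted in the question."""
    # -c and -noframes are the flags from the question; nothing else is added.
    return [binary, "-c", "-noframes", str(pdf_path)]


def convert(pdf_path, workdir="."):
    """Run pdftohtml if it is on PATH; return the CompletedProcess, else None."""
    if shutil.which("pdftohtml") is None:
        return None  # poppler-utils does not appear to be installed
    # Passing a list (no shell=True) avoids quoting problems with the path.
    return subprocess.run(
        build_pdftohtml_command(pdf_path),
        cwd=workdir,
        capture_output=True,
        text=True,
        check=False,
    )
```

Calling `convert("documents/document1.pdf")` from the application's directory would mirror the terminal command. Inspecting `returncode` and `stderr` on the result is one way to see why a run fails under the web server's user.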
0
votes
2 answers
JavaScript-based horizontal scrolling of a multi-page PDF?
I'm wondering how I can accomplish horizontal scrolling of the pages of a PDF using JavaScript. Is it better to:
Convert the pages of the PDF into HTML files and then click left-right between iframes where src="...each page.html"?
Convert the…

tim peterson
- 23,653
- 59
- 177
- 299
0
votes
0 answers
Remove defaultStyle from PDFDomTree
I would like to convert a PDF document into HTML content. I am using the PDF2Dom library, version 2.0.3.
org.apache.pdfbox:pdfbox:2.0.18
Conversion…

vikifor
- 3,426
- 4
- 45
- 75
0
votes
0 answers
How to install poppler-utils on a GoDaddy server without SSH
I just want to install poppler-utils on my development server to convert PDF to HTML. Unfortunately, I don't have SSH access. Does anybody know how to install it without SSH?

Abbas Mastan
- 107
- 11
0
votes
0 answers
Converting PDF to HTML with the gufy/pdf-to-html package from GitHub fails with the error: Undefined array key "pages"
The controller code is below:
public function generate(Request $req)
{
Config::set('pdftohtml.bin', 'C:/poppler-0.37/bin/pdftohtml.exe');
Config::set('pdfinfo.bin', 'C:/poppler-0.37/bin/pdfinfo.exe');
$pdf = new…

Abbas Mastan
- 107
- 11
0
votes
0 answers
Using Adobe Acrobat SDK on Linux
Hi, I'm trying to automate PDF-to-HTML conversion on my Linux cloud using the Adobe Acrobat SDK. I know there are a lot of PDF-to-HTML packages out there, but none preserves the layout of my PDF as well as Adobe does. Is there a way to call Adobe Acrobat…
0
votes
1 answer
Arial font not working in a PDF file on Ubuntu
I have a PDF that displays properly in the browser, but some text does not show in the PDF viewer on Ubuntu. I checked the fonts of the PDF and it returned:
Syntax Error: non-embedded font using identity encoding: Arial-BoldMT
Syntax Error:…

Vikram
- 3,171
- 7
- 37
- 67
0
votes
0 answers
Using co-ordinates in XML generated by poppler to build an email template
I generated a 72 DPI image and XML with zoom set to 1 from this PDF.
Although the DPI was 72, converting the co-ordinates in the XML to pixels required iteratively tweaking the DPI using this sheet. 90.5 seems to work well.…

qwertynik
- 118
- 2
- 10
0
votes
1 answer
I am using PHP Chrome HTML 2 PDF but am unable to use a header and footer in the PDF
Please find the code below:
This is the Content";
$input = new…
0
votes
1 answer
Choose encoding for pdftohtml
How can I force pdftohtml output to be UTF-8?
$ pdftohtml -enc utf8 my.pdf
Error: Couldn't find unicodeMap file for the 'utf-8' encoding
And -listenc doesn't seem to be a valid option.
I think it is using ISO-8859-1 by default (although for some…

theonlygusti
- 11,032
- 11
- 64
- 119
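
One possible fallback for the encoding question above, assuming the asker's guess is right and the generated HTML really is ISO-8859-1: transcode the bytes to UTF-8 after conversion. This does not address the `-enc` error itself, and the function name `latin1_to_utf8` is hypothetical.

```python
def latin1_to_utf8(data: bytes) -> bytes:
    """Re-encode bytes assumed to be ISO-8859-1 as UTF-8."""
    # Every byte value is valid ISO-8859-1, so decoding cannot fail,
    # but the result is only meaningful if the assumption holds.
    return data.decode("iso-8859-1").encode("utf-8")


def transcode_file(src_path: str, dst_path: str) -> None:
    """Read src_path as ISO-8859-1 bytes and write a UTF-8 copy to dst_path."""
    with open(src_path, "rb") as src:
        converted = latin1_to_utf8(src.read())
    with open(dst_path, "wb") as dst:
        dst.write(converted)
```

Any charset declaration inside the HTML is left untouched by this sketch and would need to be updated separately.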
0
votes
0 answers
PDF to HTML converter - Stuck
I need to have just one PDF on my website as an HTML file. I don't need to generate them on my website; I just need to add one PDF to a page and put text over it. Does anyone know the best way to convert the PDF to HTML? I have found places like…

Russell Hertel
- 121
- 3
- 14
0
votes
1 answer
Pandas dataframe.to_html() - edit text color and add background color of header columns
I am exporting a pandas dataframe as an HTML table and have been playing around with styling the header columns in the final table. Here is an example dataframe I have generated:
x = np.arange(0,100,5)
y = np.arange(0,20,1)
example_df =…

Jake Niederer
- 81
- 11
0
votes
1 answer
I am trying to extract data as HTML elements in Python using pdfminer
I am trying to extract data as HTML from a PDF using pdfminer. I was able to extract text from the same PDF, but now I am getting an error while extracting the data as HTML. I have to filter the data further to categorize it in CSV. This is the…

Rajat Nagarkar
- 11
- 2
0
votes
1 answer
Replacing images with image names in a PDF using PyMuPDF
Using PyMuPDF, I want to extract all images from a PDF and save them separately, then replace each image in the PDF with its image name at the same position and save the result as another document. I can save all images with the following code.
import…

Mohammad Ahmed
- 57
- 1
- 1
- 6
0
votes
1 answer
How to convert PDF generated from reportlab in Python to HTML
I have finished generating a PDF with tables, headers and clickable TOC styled nicely. Now I would like to have an HTML version. Is it possible to use the same ReportLab to just easily generate the HTML file?

JA-pythonista
- 1,225
- 1
- 21
- 44