Questions tagged [pdf-conversion]

Relating to converting between Portable Document Format and other file formats. Questions asking us to recommend or find a conversion tool or library are off-topic.

This tag is for questions relating to programmatically converting to and from the open standard file format PDF. If a specific conversion is involved, the appropriate tag for the other format should also be used.

Conversion solutions may range from complete rasterization (and graphic embedding) to intense . The middle ground generally converts at a high enough level to recognize and use text attributes where possible, falling back to graphic rendering only when necessary.

Questions asking us to recommend or find a tool, library, documentation or other off-site resource are off-topic for Stack Overflow as they tend to attract opinionated answers and spam.

266 questions
4 votes, 1 answer

How to deal with unicode character encoding issues while converting documents from PDF to Text

I am trying to extract text from a PDF. The PDF contains text in Hindi (Unicode). The extraction utility I am using is Apache PDFBox (http://pdfbox.apache.org/). The extractor extracts the text, but the text is not recognizable. I tried…
4 votes, 0 answers

Export Flash Frame as PDF

I'm looking for a toolkit, command-line tool, or library that will let me export one or more frames of a .swf to a PDF. I'd rather not have to write my own converter. I'm looking for a vector solution, not a bitmap one. There is a Swftools thread on…
Eugene
4 votes, 1 answer

Concatenating a multi-page PDF into a single-page PDF

So I have a multi-page PDF that looks something like this: [image: multipage]. It currently has more than one page, but I would like to concatenate them. The two pages should be concatenated in such a way that they become one page. (Literally joining the two…
skspawn
4 votes, 1 answer

Dramatic speed difference between PDF to PNG vs. PDF to JPEG

Using pdftocairo on a Xeon E5-2630 (2.3GHz) CentOS 6.3 machine, with poppler 0.24, cairo 1.12, libpng 1.2.49, openjpeg 1.3.10 (both CentOS default), I tested converting a 37-page PDF to JPEG and PNG. I ran pdftocairo with no special options…
Victor
4 votes, 4 answers

PDF compressing library/tool

I am working on a project to reduce the size of PDFs by compressing them. I am wondering whether there are any really good tools or libraries (.NET) on the market. I did try a few tools like Onstream Compression, but the results were not satisfactory.
Sabby62
3 votes, 1 answer

Context deadline exceeded - Gotenberg API (/forms/libreoffice/convert)

I am trying to convert MS Office files to PDF using the Gotenberg API. For some files, I am getting unoconv PDF context deadline exceeded with a 503 status. I have increased the read, write and process timeouts to 60 secs. How can I resolve this issue?…
3 votes, 0 answers

Exit with code 1 due to network error: HostNotFoundError in Python

I am trying to convert HTML to PDF using "wkhtmltopdf". The code is as follows: import pdfkit path_wkhtmltopdf = r'C:\Program Files\wkhtmltopdf\bin\wkhtmltopdf.exe' config = pdfkit.configuration(wkhtmltopdf=path_wkhtmltopdf) urls=…
anonymous13
3 votes, 1 answer

ExpertPdf Conversion error: WebKit Navigation timeout in GetPdfBytesFromUrl

I am working on upgrading a .NET application without changing its functionality. What I have done so far: upgrading from MVC 2 to MVC 4; upgrading from framework 3.5 to 4.0; using Visual Studio 2015 instead of 2010; upgrading the custom made utilities…
LAN
3 votes, 1 answer

How to manage LocalConverter and when to invoke the ShutDown() method?

I wrote some code using the documents4j library to convert some documents from .docx to .pdf. I followed the examples in the documentation and the conversion works perfectly using MS Word, but I noticed that after all conversions complete and methods…
D. Pesc.
3 votes, 1 answer

Table width not set in iTextSharp when converting HTML to PDF

I am trying to convert HTML to PDF, but the problem I face is that the width of the HTML table tags is not being set correctly. This is my HTML
SKumar
3 votes, 1 answer

Convert DOC to PDF using unoconv via Symfony Component

I'm trying to convert Word documents to PDF from PHP, using unoconv on the command line. I'm using the Symfony Process Component to run the command: public function run() { $cmd = 'unoconv --listener & unoconv ' .…
Kiee
3 votes, 1 answer

LibreOffice command line crashes opening DOCX or converting to PDF, on Windows 7

I need to convert a DOCX document to PDF using LibreOffice in command-line mode, but it crashes: soffice.exe -headless -invisible -convert-to pdf myfile.docx. It also crashes when trying to open the same document: soffice.exe -o myfile.docx. However,…
German Latorre
3 votes, 1 answer

Embed a PDF in a webpage using pdf.js

pdf.js is a rather big project for a newbie like me. As most posts say, this project is a great tool for embedding a PDF file in a web page, but I'm having a hard time figuring out how to use it. What I want to know is how I can embed a local PDF…
user2785929
3 votes, 4 answers

Best scanner settings for scanning documents (TIFF and PDF)

What are the best scanner settings for scanning documents (black-and-white text) so that OCR conversion gives the best results, and what are the standard settings and specifications for the PDF and TIFF formats?
Lohith MV
3 votes, 2 answers

Java: passing an input or output stream to ITextRenderer (XHTML to PDF converter)

I want to convert my XHTML text to a PDF. I converted it to a FileOutputStream, but I can't find a way to pass it as an input to the ITextRenderer. Is that possible, and how? The code: String finalXhtml=xhtmlparser(xmlText); ByteArrayInputStream…
mohammad