Highest Voted 'xpdf' Questions

1

vote

1 answer

Specific version of pdftotext binary (old version of poppler-utils is not same version)?

Been digging for ages and struggling to find the answer. We have version 0.39 of a single pdftotext binary on our OSX dev systems (installed using brew install poppler). We cannot find other versions; brew search poppler lists only a single one. We are…

pdftotext xpdf

asked Sep 05 '16 at 08:54

Ben

1,292
1
13
21

1

vote

2 answers

Installing pdftotext on Windows (for use with R, 'tm' package)

I am having trouble using R, 'tm' package, to read in .pdf files. Specifically, I try to run the following code: library(tm) filename = "myfile.pdf" tmp1 <- readPDF(PdftotextOptions="-layout") doc <-…

r tm pdftotext xpdf

asked Mar 23 '16 at 11:49

SuperUser01

199
1
13

1

vote

1 answer

Using readPDF in R (tm package)

I'm a beginner at R and having a bit of trouble using the tm package. I need to extract specific data from page 55 through 300 of this and thought that R might be a good way to do so. (If anyone has a better idea, please let me know!) I did some…

r text-mining xpdf

asked Aug 18 '15 at 01:03

JDY

167
2
10
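For the page-range case in the question above (pages 55 through 300), one possible approach outside R is to call pdftotext directly with its `-f` (first page) and `-l` (last page) options. The following is a minimal sketch, assuming pdftotext is installed and on the PATH; the helper names `build_pdftotext_cmd` and `extract_pages` and the file names are hypothetical, not taken from the question.

```python
import shutil
import subprocess


def build_pdftotext_cmd(pdf_path, txt_path, first_page, last_page, layout=False):
    """Return the argument list for a pdftotext call limited to a page range.

    Hypothetical helper: -f and -l are pdftotext's first/last page options,
    -layout asks it to keep the physical layout of the text.
    """
    if first_page < 1 or last_page < first_page:
        raise ValueError("expected 1 <= first_page <= last_page")
    cmd = ["pdftotext", "-f", str(first_page), "-l", str(last_page)]
    if layout:
        cmd.append("-layout")
    cmd += [pdf_path, txt_path]
    return cmd


def extract_pages(pdf_path, txt_path, first_page, last_page, layout=False):
    """Run pdftotext on the given page range (assumes the binary is on PATH)."""
    if shutil.which("pdftotext") is None:
        raise FileNotFoundError("pdftotext not found on PATH")
    cmd = build_pdftotext_cmd(pdf_path, txt_path, first_page, last_page, layout)
    # check=True raises CalledProcessError if pdftotext exits with an error
    subprocess.run(cmd, check=True)
```

Something like `extract_pages("report.pdf", "pages.txt", 55, 300)` would then write only the requested pages to the text file, which can be post-processed in any language.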

1

vote

1 answer

'pdftotext' errors encountered on Windows 7 -- same PDFs processed correctly under Linux

I have an old Linux version (0.12.4) of pdftotext that runs without problems, but I would like to run it on a Windows 7 machine. I downloaded the Windows installer for what appears to be the latest version, xpdf-2.03-bin.exe from…

linux windows-7 pdftotext poppler xpdf

asked Oct 10 '14 at 18:46

LFleming

21
3

1

vote

2 answers

Using AJAX and PHP to output PDF

The way my web app is supposed to work is that the user fills out a form and then the AJAX sends the form data to a PHP file that generates a PDF (using xpdf). Then the generated PDF should be available for download on the HTML page with the AJAX.…

php javascript jquery ajax xpdf

asked Sep 05 '13 at 22:27

Ben Davidow

1,175
5
23
51

1

vote

1 answer

Discrepancy between PDF cropbox and SVG created out of a PDF page

I am trying to extract the background image of a PDF page to an SVG (using xpdf library). The problem I am facing is that the PDF contains additional images/graphics (presumably outside the cropbox) that are not rendered by PDF readers, but the…

pdf svg xpdf

asked Aug 26 '13 at 18:28

so2

322
2
13

1

vote

1 answer

How to identify and extract vector graphics from PDF using xpdf library?

Does anyone have a sample code demonstrating how to extract vector graphics objects (such as those representing charts and flow diagrams) from a PDF using XPDF library? There doesn't seem to be any documentation available on the Web for xpdf library…

pdf xpdf

asked Mar 22 '13 at 12:37

so1

58
1
10

1

vote

0 answers

parsing pdf content stream to understand paragraph boundary

Is there a way to parse the pdf content stream and identify paragraph boundaries? I read ISO 32000-1:2008 but could not understand whether the pdf content stream contains any operator which tells display software to start or end a paragraph. Can…

pdf pdfbox xpdf

asked Feb 15 '13 at 20:59

rivu

2,004
2
29
45

1

vote

1 answer

Why can text be extracted from scanned documents, but not images?

I asked a similar question before on Stack Overflow. I wanted to ask another related question, so I am rephrasing the original question. I was using PDFBox to extract image and text from a pdf, available in skydrive and scribd. I had…

pdf pdfbox xpdf

asked Feb 12 '13 at 22:08

rivu

2,004
2
29
45

0

votes

0 answers

I can't get PDF document file path with PHP-XPDF

I have a WordPress site installed on a VPS with Debian 11. One of the functionalities is reading uploaded PDF documents using the XPDF library and PHP wrapper PHP-XPDF: https://github.com/alchemy-fr/PHP-XPDF, which uses XPDFReader:…

php debian pdftotext xpdf

asked Jun 16 '23 at 10:09

weezle

79
1
6
15

0

votes

0 answers

pdftops to convert PDF to EPS in basic mode?

I'm using pdftops in a script to convert PDF to EPS. However, it looks like the output is not in the "basic" EPS format, and I can't open it with Photopea. I'm getting this error: The command that I'm executing is this one: pdftops -eps -level2sep file.pdf…

cairo eps poppler xpdf poppler-utils

asked Nov 11 '22 at 15:33

Aral Roca

5,442
8
47
78

0

votes

2 answers

Converting pdf to text

I need to create a C# or C++ (MFC) application that converts pdf files to txt. I need not only to convert, but also to remove headers, footers, some garbage characters on the left margin etc. Thus the application should allow the user to set page margins to…

c# c++ pdf xpdf

asked Sep 14 '11 at 18:38

dpreznik

247
5
18

0

votes

1 answer

Make xpdf Pdf2Txt function as thread safe

I have tried to use xpdf source code in an MFC application to convert pdf to text. The code sample is taken from their site (or repository): int Pdf2Txt(std::string PdfFile, std::string TxtFile) const { GString* ownerPW, *userPW; …

c++ multithreading mfc xpdf

asked Sep 19 '22 at 11:19

Flaviu_

1,285
17
33

0

votes

1 answer

Convert all text's color in PDF to black while ensuring text is selectable

Looking for ways to change the color of all text in a PDF to black using an open-source command-line tool (or package) while ensuring that text is rendered as text. Thanks to some answers on SO, found a command to convert the PDF to grayscale. gs -o…

pdf ghostscript postscript mupdf xpdf

asked Mar 14 '22 at 11:41

qwertynik

118
2
10

0

votes

1 answer

path should be string, bytes or os.PathLike, not InMemoryUploadedFile

In Django I get the file uploaded by the user with input_pdf = request.FILES['pdf'] and I want to extract the file's text with the pdftextract library with pdf = XPdf(input_pdf) but it gives an error: TypeError: _getfullpathname: path should be string, bytes…

python pdf xpdf

asked Sep 08 '21 at 18:48

Meysam

105
1
1
6
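The TypeError in the question above suggests the library expects a filesystem path, not an in-memory upload object. One common workaround, sketched here under that assumption, is to copy the upload to a temporary file and pass the resulting path instead. The helper name `save_upload_to_temp` is hypothetical; the sketch relies only on the upload object exposing a file-like `read()` method, which Django's uploaded file objects do.

```python
import os
import shutil
import tempfile


def save_upload_to_temp(uploaded, suffix=".pdf"):
    """Copy a file-like upload to a named temporary file and return its path.

    Hypothetical helper: `uploaded` only needs a read() method, so it works
    with an in-memory upload as well as a plain io.BytesIO.
    The caller is responsible for deleting the file afterwards.
    """
    tmp = tempfile.NamedTemporaryFile(delete=False, suffix=suffix)
    try:
        with tmp:
            # Stream the contents in chunks instead of loading them all at once
            shutil.copyfileobj(uploaded, tmp)
    except Exception:
        # Do not leave a half-written file behind if the copy fails
        os.unlink(tmp.name)
        raise
    return tmp.name
```

In a view this might be used as `path = save_upload_to_temp(request.FILES['pdf'])`, followed by `XPdf(path)` and an `os.unlink(path)` once the text has been extracted. Whether the library accepts the path in exactly that form is an assumption based on the error message.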
