Highest Voted 'pdf2image' Questions

1

vote

0 answers

PDF2image on AWS Lambda - resulted PNG has wrong fonts

I am using pdf2image convert_from_bytes on my own PDFs in order to get them in PNG format. The context is AWS Lambda, py 3.8. ... images = convert_from_bytes(infile, dpi=DPI, fmt=FMT) for…

asked May 03 '21 at 21:16

TaiT's

3,138
3
15
26

1

vote

0 answers

Poppler is installed, pdf2image path error seems to have no resolution. Has this been fixed?

I am running Debian Buster in a Docker image. I have installed every poppler package to rule anything unusual out. I have explicitly added the paths to all of the poppler files, containing directories, etc. I have followed the documentation…

python-3.x poppler debian-buster pdf2image

asked May 03 '21 at 00:29

Chris

28,822
27
83
158

0

votes

1 answer

Convert very large PDF to images with python

I have an extremely large PDF containing scans that are approximately 30,000 px wide (wtf!). I have a Python script that works well for normal-sized PDFs, but when confronted with this large PDF it outputs only 1-pixel-wide white squares as images. The…

python pdf python-imaging-library pdf2image

asked Jul 11 '23 at 08:46

Seglinglin

447
1
4
17
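One thing to try for the oversized-scan question above is to render at a lower resolution, so the output width stays under a pixel budget. The excerpt does not establish why the images come out 1 pixel wide, so this is only a sketch of the arithmetic. The helper name `dpi_for_max_width` and the 5,000 px budget are hypothetical, and the page width in points is assumed to be known already (for example from `pdfinfo`). The result would be passed as the `dpi` argument of `convert_from_path`; pdf2image also accepts a `size` argument as an alternative.

```python
def dpi_for_max_width(page_width_pt, max_width_px, default_dpi=200):
    """Return a DPI that keeps the rendered page at most max_width_px wide.

    page_width_pt is the page width in PDF points (1 point = 1/72 inch).
    default_dpi mirrors pdf2image's default of 200 and is never exceeded.
    """
    if page_width_pt <= 0:
        raise ValueError("page_width_pt must be positive")
    # Rendered width in pixels = page_width_pt / 72 * dpi, solved for dpi.
    capped = max_width_px * 72.0 / page_width_pt
    return max(1, min(default_dpi, int(capped)))


# A 10,800 pt wide page is 150 inches, i.e. 30,000 px at 200 DPI.
dpi = dpi_for_max_width(page_width_pt=10800, max_width_px=5000)
```

Rendering at a lower DPI loses detail, so if the scans need to stay at full resolution, cropping or tiling the page would be a different approach.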

0

votes

1 answer

How to fix TypeError: expected str, bytes or os.PathLike object, not UploadedFile

I'm trying to build an OCR platform using streamlit and easyocr. I already managed to do text conversions from images, but I can’t convert PDF to JPG in order to continue further processing. I tried downloading the pdf, then converting it to jpg, then…

python pdf ocr streamlit pdf2image

asked Jul 02 '23 at 10:56

Алексей Курочкин

1
1
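The TypeError in the title above is what you typically get when a file-like upload object is handed to a function that expects a filesystem path. A minimal sketch of one way around it, assuming the upload object has a `read()` method (Streamlit's `UploadedFile` does): read the raw bytes and pass them to pdf2image's `convert_from_bytes` instead of `convert_from_path`. The helper name `uploaded_pdf_to_images` and its `convert` parameter are hypothetical; `convert` exists only so the function can be tried without poppler installed.

```python
import io


def uploaded_pdf_to_images(uploaded_file, convert=None, **kwargs):
    """Convert a file-like PDF upload to images without touching the disk.

    uploaded_file: any object with a read() method returning the PDF bytes.
    convert: callable taking bytes; defaults to pdf2image.convert_from_bytes.
    kwargs: forwarded to the converter (e.g. dpi=200, fmt="jpeg").
    """
    if convert is None:
        # Imported lazily so the helper can be exercised with a stand-in.
        from pdf2image import convert_from_bytes as convert
    data = uploaded_file.read()
    return convert(data, **kwargs)


# Stand-in converter for illustration; real use would omit `convert`.
fake_upload = io.BytesIO(b"%PDF-1.4 placeholder")
result = uploaded_pdf_to_images(
    fake_upload, convert=lambda data, **kw: [len(data), kw], dpi=150
)
```

With the real converter, each returned item is a PIL image that can be saved as JPEG or passed on to the OCR step.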

0

votes

0 answers

Killed error while removing watermark in PDF & Merging the images to get PDF in VsCode. [ OS : Ubuntu ]

Last two lines of the output shown in my terminal: Adding TEXT TOP @ LEFT CORNER for AP ECET 2020 Electronics and Communication Engineering September 14, 2020 Shift 2 English Question Paper Killed. The step that removes the watermark from the PDF is working…

python pypdf pdf2image

asked May 25 '23 at 07:58

Inflection

1

0

votes

2 answers

How can I convert image coordinates to PDF coordinates when using pdf2image and table-transformers?

I am using pdf2image to convert PDFs to images and detecting tables with table-transformers. I need help with coordinates. The issue is that I am getting perfect table borders, but pixel coordinates in the images differ from PDF coordinates. Is there any way to convert image…

python python-3.x pdf2image

asked May 22 '23 at 07:48

siddharth patel

31
8
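The pixel-to-PDF mapping asked about above comes down to a scale and a vertical flip, under some assumptions: the page was rendered at a known DPI (pdf2image defaults to 200) without a `size` override, the page is not rotated, and the PDF origin is the bottom-left corner of the page. A minimal sketch under those assumptions; the function name `pixel_box_to_pdf_points` is hypothetical, and the page height in points has to come from the PDF itself.

```python
def pixel_box_to_pdf_points(box, dpi, page_height_pt):
    """Map an image-space box to PDF user space.

    box: (x0, y0, x1, y1) in pixels, origin top-left, y growing downward.
    dpi: resolution the page was rendered at.
    page_height_pt: page height in PDF points (1 point = 1/72 inch).
    Returns (x0, y0, x1, y1) in points, origin bottom-left, y growing upward.
    """
    x0, y0, x1, y1 = box
    to_pt = lambda v: v * 72.0 / dpi  # pixels -> points
    # The vertical axis flips, so the pixel bottom edge (y1) becomes the
    # lower PDF y value and the pixel top edge (y0) the upper one.
    return (to_pt(x0), page_height_pt - to_pt(y1),
            to_pt(x1), page_height_pt - to_pt(y0))


# A box detected on a US Letter page (792 pt tall) rendered at 200 DPI.
pdf_box = pixel_box_to_pdf_points((200, 100, 400, 300), dpi=200,
                                  page_height_pt=792)
```

Some PDF libraries use a top-left origin for their own coordinates, in which case the flip would be dropped and only the scaling kept.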

0

votes

1 answer

Pdf file produces blank

I am creating a PDF file without text from a pdf file with text using the following program def remove_text_from_pdf(pdf_path_in, pdf_path_out): '''Removes the text from the PDF file and saves it as a new PDF file''' #Open the PDF file with the…

python pdf-reader pdf2image pdf-writer

asked Mar 10 '23 at 09:58

Rutvij Gholap

3
1

0

votes

0 answers

Obtained position of tables in pdf and plot the bounding box on the image

Using this script, I can get the bounding boxes of the tables in my e-pdf: tabula.read_pdf(file, stream=True,guess=True,lattice=False,multiple_tables=True, output_format="json", pages=pg_num) However, I want to plot the bounding boxes detected…

python computer-vision tabula-py pdf2image

asked Feb 18 '23 at 08:26

skw1990

63
6

0

votes

0 answers

PDFPageCountError: Unable to get page count

I am trying to use pdf2image, but I am getting this error: PDFPageCountError: Unable to get page count. I/O Error: Couldn't open file 'C:\Users\user_name\Desktop\folder_name\folder2_name\folder3_name\007-084841-1 to 31 Dec'22': No error. It is…

pdf2image

asked Jan 11 '23 at 15:22

CrisD

1
1

0

votes

0 answers

Error with the path of the poppler folder

I am getting the following when using a script with a poppler path : pdf2image.exceptions.PDFInfoNotInstalledError: Unable to get page count. Is poppler installed and in PATH? But poppler is correctly installed on my computer. The code I am using…

python path poppler pdf2image

asked Jan 09 '23 at 09:23

clemdcz

99
7

0

votes

0 answers

How to remove boxes around shx text without AutoCAD?

I am trying to use OCR (optical character recognition) on a lot of documents of the same type. I use the pdf2image library for Python. But when it encounters PDFs with AutoCAD shx text, it captures the bounding boxes around the text as well. At first they are not visible on…

python pdf autocad pdf2image

asked Dec 22 '22 at 10:59

ArsenK

1

0

votes

0 answers

Converting multi page pdf to jpeg results in a single page

I have written this Python script to convert multi-page PDFs to JPEG. import requests, io from pdf2image import convert_from_bytes url = 'http://www.asx.com.au/asxpdf/20171108/pdf/43p1l61zf2yct8.pdf' response = requests.get(url) pages =…

python buffer pdf2image

asked Dec 09 '22 at 10:28

Rahul

895
1
13
26

0

votes

0 answers

Difference in Length of ImageBytes while performing PIL IMAGE .getvalue() operation on AWS LAMBDA?

I am trying to perform a .getvalue() operation on a PIL image on AWS Lambda to extract the image's bytes, but the length of the byte string differs between my local machine and AWS Lambda, below…

python-3.x amazon-web-services aws-lambda python-imaging-library pdf2image

asked Dec 07 '22 at 07:36

Rahul Pidkalwar

16
1

0

votes

0 answers

Django convert InMemoryUploadedFile PDF to images

I need to convert uploaded PDF to images. I'm using pdf2image function convert_from_path() to convert the image but am getting an error Unable to get page count. My code looks somewhat like this: pages =…

python django django-forms django-file-upload pdf2image

asked Nov 30 '22 at 21:01

Marian

13
5

0

votes

1 answer

Python: pdf2image doesn't write .jpg - no error message

I'm working on a python script that checks the .pdf files in a directory, creates a new directory for each file, converts the .pdf into images, and writes the images as jpg into the new directory. I'm using pdf2image and have the following…

python python-3.x pdf2image

asked Nov 20 '22 at 13:12

bluesky

1
2
