Can't get the text from a PDF

Asked Nov 28 '22 at 13:07

When I try to parse the PDF, I don't get its content; I get random symbols and characters instead. What is the reason for this? I expected the code below to return the proper text. I have also tried PyPDF2 and still cannot get the text.

import fitz  # PyMuPDF

filename = "test2.pdf"
with fitz.open(filename) as f:
    # Iterate over the pages and print each page's text in reading order
    for p in f:
        print("\n\n")
        print(p.get_text(sort=True))

Result (screenshot): this is the type of output I get.
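Since the PDF itself is not posted, the cause cannot be confirmed here. One common reason for this symptom, offered only as an assumption, is that the PDF's fonts do not carry a usable mapping from glyphs to Unicode, so any extractor returns meaningless characters. A minimal sketch for flagging such pages is shown below; `garbled_ratio` is a hypothetical helper (not part of PyMuPDF or PyPDF2) that only counts obviously invalid characters.

```python
import unicodedata


def garbled_ratio(text: str) -> float:
    """Fraction of non-whitespace characters that look like extraction debris.

    Hypothetical heuristic: counts the Unicode replacement character (U+FFFD)
    and characters in the control (Cc), private-use (Co) or unassigned (Cn)
    categories. It cannot detect text made of valid but wrong characters.
    """
    chars = [c for c in text if not c.isspace()]
    if not chars:
        return 0.0
    suspicious = sum(
        1
        for c in chars
        if c == "\ufffd" or unicodedata.category(c) in ("Cc", "Co", "Cn")
    )
    return suspicious / len(chars)
```

In the loop above, one could call `garbled_ratio(p.get_text())` for each page and treat a high value as a hint that the page needs a different approach, such as OCR. The threshold is a judgment call, and a low value does not prove the text is correct.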

asked Nov 28 '22 at 13:07

Hemil Parmar


You should post the PDF in question as well, so the issue can be reproduced. – hanshenrik Nov 28 '22 at 13:14
FYI: this question has been asked and answered on PyMuPDF's home page already! – Jorj McKie Nov 29 '22 at 14:44
