I have used the zlib Python library to decode streams that were compressed with FlateDecode. Until now, every PDF file I have worked with showed the correct values in the Tj and TJ operators, but I am having trouble with this PDF: the decoded text does not match what is displayed in the PDF.
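For reference, the decoding approach described above looks roughly like the following minimal sketch. It is not the exact script: the helper names `inflate_streams` and `tj_operands` are made up for illustration, the regexes are a simplification rather than a real PDF parser, and it assumes every stream of interest uses only FlateDecode.

```python
import re
import zlib

# Raw bytes between the "stream" and "endstream" keywords (simplified; a real
# parser would use the /Length entry of the stream dictionary instead).
STREAM_RE = re.compile(rb"stream\r?\n(.*?)endstream", re.DOTALL)

# Literal-string operand of a Tj operator, e.g. "(Hello) Tj".
# Escaped parentheses and hex strings such as "<0012> Tj" are not handled.
TJ_RE = re.compile(rb"\((.*?)\)\s*Tj")


def inflate_streams(pdf_bytes):
    """Yield the decompressed body of every stream that zlib can inflate."""
    for match in STREAM_RE.finditer(pdf_bytes):
        try:
            # decompressobj ignores the end-of-line bytes left after the
            # compressed data, which plain zlib.decompress may not accept.
            yield zlib.decompressobj().decompress(match.group(1))
        except zlib.error:
            # Not a Flate stream (or uses another filter); skip it.
            continue


def tj_operands(content):
    """Return the raw byte operands of every Tj operator in a content stream."""
    return TJ_RE.findall(content)
```

Note that this only returns the operand bytes exactly as they are stored in the content stream; no font or encoding information is applied to them.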
I can copy text from the PDF into Notepad without any issue, and pdftotext also gives the expected output with the correct words.
I have also used Adobe Preflight to inspect the document's internal structure and double-check the decoded text I get via zlib, but even that shows garbage values that do not match what is displayed in the PDF.
Why do I get these garbage values in the text operators, and how is pdftotext still able to get the correct results?
Also, how do I get the correct results via Python/zlib?