I am processing email files from Amazon SES. Each email contains a PDF attachment encoded with base64.

Content-Type: application/octet-stream; name="IN081AKC.pdf"
Content-Transfer-Encoding: base64
Content-Disposition: attachment; filename="IN081AKC.pdf"

encoded_string = "JVBERi0xLjI………………."

I am trying to decode the string with base64.b64decode(encoded_string), but I get the following exception:

    return base64.b64decode(encoded_string)
  File "/usr/local/Cellar/python@3.9/3.9.9/Frameworks/Python.framework/Versions/3.9/lib/python3.9/base64.py", line 87, in b64decode
    return binascii.a2b_base64(s)
binascii.Error: Invalid base64-encoded string: number of data characters (12693) cannot be 1 more than a multiple of 4

I also checked the padding: len(encoded_string) % 4 == 0.
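
The length check and the error message measure different things: len() counts the "=" padding characters, while the decoder's message counts only data characters. A minimal sketch of that difference, assuming the standard base64 alphabet; `count_data_chars` and the short `sample` string are hypothetical stand-ins, not the real attachment:

```python
import string

# Standard base64 alphabet (assumption: not the URL-safe variant).
B64_ALPHABET = set(string.ascii_letters + string.digits + "+/")

def count_data_chars(encoded: str) -> int:
    """Count only alphabet characters, ignoring '=' padding and whitespace."""
    return sum(1 for ch in encoded if ch in B64_ALPHABET)

# Hypothetical stand-in: 8 valid characters, one stray "A", three "=".
sample = "aGVsbG8hA==="
print(len(sample) % 4)               # 0
print(count_data_chars(sample) % 4)  # 1
```

A remainder of 1 for the data characters means one character is left over that cannot form a full byte, no matter how much padding follows it.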

The encoded_string ends with "A===". If I remove "A===" or add "A==" then I am able to decode the string. Can we remove "A===" from the end of the string?

  • The string ends with A===. If I remove A=== then I am able to decode it. But I am not sure if this is the right thing to do. – Sushma Yadav Mar 24 '22 at 08:49
  • If I remove A===, then the decoded string ends with "91\r\n%%EOF\r\n". And if I add A== then the decoded string ends with "91\r\n%%EOF\r\n\x00'". Is it safe to remove A===? – Sushma Yadav Mar 30 '22 at 10:12
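
A single leftover base64 character carries only 6 bits, which is not enough to encode a byte, so dropping it cannot remove a complete byte of the file. "A" also has the value 0 in the base64 alphabet, and %%EOF is the normal end-of-file marker of a PDF. The following is a minimal sketch of a tolerant decoder built on that reasoning, assuming the stray character sits at the very end of the string and the rest of the data is intact; `decode_lenient` is a hypothetical helper name, not part of the base64 module:

```python
import base64
import re

def decode_lenient(encoded: str) -> bytes:
    """Decode base64 text, tolerating one dangling trailing character."""
    # MIME bodies are usually wrapped, so drop line breaks and spaces first.
    data = re.sub(r"\s+", "", encoded)
    # Remove whatever padding is present; it is recomputed below.
    data = data.rstrip("=")
    # A remainder of 1 means one leftover character holding only 6 bits.
    # It cannot represent a byte, so it is discarded (assumption: the
    # rest of the payload is complete and only the tail is malformed).
    if len(data) % 4 == 1:
        data = data[:-1]
    # Re-pad to a multiple of 4 and decode normally.
    data += "=" * (-len(data) % 4)
    return base64.b64decode(data)

# Hypothetical stand-in for the attachment: valid data plus a stray "A===".
good = base64.b64encode(b"hello!").decode("ascii")
print(decode_lenient(good + "A==="))
```

If the data were truncated somewhere other than the end, this would still decode without an error but the output could be wrong, so it is worth checking that the result starts with %PDF and ends with %%EOF before trusting it.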