
I am creating a PDF in Python using the borb library. Everything works well, and I've found it really easy to work with.

However, I now have an object that is of type 'Document' in memory. I don't want to actually save this file anywhere. I just want to upload it to S3.

What is the best way to convert this file to bytes so that it can be uploaded to S3?

Below is the actual code, as well as some things I've tried.

async def create_invoice_pdf():
    # Create document
    pdf = Document()

    # Add page
    page = Page()
    pdf.append_page(page)

    page_layout = SingleColumnLayout(page)
    page_layout.vertical_margin = page.get_page_info().get_height() * Decimal("0.02")

    page_layout.add(
        Image(
            "xxx.png",
            width=Decimal(250),
            height=Decimal(60),
        )
    )

    # Invoice information table  
    page_layout.add(_build_invoice_information())  
    
    # Empty paragraph for spacing  
    page_layout.add(Paragraph(" "))

    # Itemized description
    page_layout.add(_build_itemized_description_table())

    page_layout.add(Paragraph(" "))

    # Itemized payment description
    page_layout.add(_build_payment_itemized_description_table())

    upload_to_aws(pdf, "Borb/Test", INVOICE_BUCKET, "application/pdf")

Result:

TypeError: a bytes-like object is required, not 'Document'

Using

pdf_bytes = bytearray(pdf)
upload_to_aws(pdf_bytes, "Borb/Test", INVOICE_BUCKET, "application/pdf")

Result:

TypeError: 'Name' object cannot be interpreted as an integer

Using

s = str(pdf).encode()
pdf_bytes = bytearray(s)
upload_to_aws(pdf_bytes, "Borb/Test", INVOICE_BUCKET, "application/pdf")

Result:

The file is uploaded, but it is corrupted and cannot be opened after download.
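To illustrate why this last attempt uploads something that is not a PDF, here is a minimal sketch. `FakeDocument` and `looks_like_pdf` are hypothetical names made up for this illustration and are not part of borb. The point is only that `str(obj).encode()` yields the bytes of the object's text representation, whereas a PDF file is expected to start with the `%PDF-` marker.

```python
class FakeDocument:
    """Hypothetical stand-in for an in-memory document object (not borb's Document)."""

    def __init__(self):
        self.pages = ["page 1"]


def looks_like_pdf(data: bytes) -> bool:
    # A serialized PDF is expected to begin with the marker "%PDF-".
    return data.startswith(b"%PDF-")


doc = FakeDocument()

# This encodes the text representation of the Python object,
# not a serialized PDF file.
text_bytes = str(doc).encode()

print(looks_like_pdf(text_bytes))
print(looks_like_pdf(b"%PDF-1.7\n"))
```

A check like this before uploading would have flagged the corrupted file early.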

I am able to save the file locally using:

with open("file.pdf", "wb") as pdf_file_handle:
    PDF.dumps(pdf_file_handle, pdf)

But I don't actually want to do this.

Any ideas? Thanks in advance.

Joris Schellekens
bruzza42
    You have an in-memory Python object. It is not a PDF. There is no way that converting it to bytes is going to work because nothing you have tried *actually creates a PDF*, except the `dumps()` method you say you don't want to use. You can't avoid that method because you need `borb`'s facilities to turn its internal data structures into PDF-style compressed PostScript. The best you can do is create an in-memory file using `io.BytesIO`. You need to upload the compressed PostScript representation of your document. You can't do that without first creating it. – BoarGules May 31 '22 at 08:52

1 Answer


Managed to figure it out:

PDF.dumps does not have to be used inside a with open(...) block. It can write to an in-memory io.BytesIO buffer instead of a file on disk, so nothing is saved locally:

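import io

# Serialize the Document into an in-memory buffer instead of a file on disk
buffer = io.BytesIO()
PDF.dumps(buffer, pdf)

# Rewind so that read() returns the bytes from the beginning
buffer.seek(0)

upload_to_aws(buffer.read(), "Borb/Test.pdf", INVOICE_BUCKET, "application/pdf")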
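For reuse, the same idea can be wrapped in two small helpers. This is only a sketch under some assumptions: `document_to_bytes` and `upload_bytes_to_s3` are hypothetical names, and since the body of `upload_to_aws` is not shown in the question, the upload here assumes boto3's `put_object` is an acceptable replacement. The S3 client can be passed in, which makes the function easy to try out with a fake client.

```python
import io
from typing import Any, BinaryIO, Callable, Optional


def document_to_bytes(dumps: Callable[[BinaryIO, Any], Any], document: Any) -> bytes:
    """Serialize `document` via `dumps(file_handle, document)` and return the bytes."""
    buffer = io.BytesIO()
    dumps(buffer, document)
    # getvalue() returns the whole buffer regardless of the current
    # position, so no seek(0) is needed here.
    return buffer.getvalue()


def upload_bytes_to_s3(
    data: bytes,
    key: str,
    bucket: str,
    content_type: str,
    s3_client: Optional[Any] = None,
) -> None:
    """Hypothetical upload helper; assumes boto3 when no client is supplied."""
    if s3_client is None:
        # Imported lazily so the helper above can be used without boto3 installed.
        import boto3

        s3_client = boto3.client("s3")
    s3_client.put_object(
        Bucket=bucket,
        Key=key,
        Body=data,
        ContentType=content_type,
    )
```

With borb this would be called as `pdf_bytes = document_to_bytes(PDF.dumps, pdf)` followed by `upload_bytes_to_s3(pdf_bytes, "Borb/Test.pdf", INVOICE_BUCKET, "application/pdf")`.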
bruzza42