I have created a REST API using FastAPI that takes a PDF document as input and returns a JPEG image of each page. I am using the pdf2image library for the conversion.

import tempfile

from fastapi import FastAPI, File, UploadFile
from pdf2image import convert_from_bytes

app = FastAPI()

@app.post("/file/convert")
async def convert(doc: UploadFile = File(...)):
    if doc.filename.endswith(".pdf"):
        # convert pdf to image
        with tempfile.TemporaryDirectory() as path:
            doc_results = convert_from_bytes(
                doc.file.read(), output_folder=path, dpi=350, thread_count=4
            )

            print(doc_results)

        return doc_results if doc_results else None

This is the output of doc_results, which is a list of PIL image objects:

[<PIL.PpmImagePlugin.PpmImageFile image mode=RGB size=2975x3850 at 0x7F5AB4C9F9D0>, <PIL.PpmImagePlugin.PpmImageFile image mode=RGB size=2975x3850 at 0x7F5AB4C9FB80>]

If I run my current code, it tries to return doc_results as JSON output, and I am not able to load those images in another API.

How can I return the image files without saving them to local storage, so that I can make a request to this API, get the response, and work on the images directly?

Also, any suggestions for speeding up the code above would be helpful.

Any help is appreciated.

user_12
1 Answer

You cannot return that object as-is; you first have to convert it into something that can be serialized.

<PIL.PpmImagePlugin.PpmImageFile image mode=RGB size=2975x3850 at 0x7F5AB4C9F9D0>

This is just the repr of a PIL object: it says that there is an image object in memory and gives its address.

The best thing you can do is convert each image to bytes and return an array of bytes.


You can create a function that takes a PIL image and returns the byte value from it.

import io

def get_bytes_value(image):
    img_byte_arr = io.BytesIO()
    image.save(img_byte_arr, format='JPEG')
    return img_byte_arr.getvalue()

Then you can use this function when returning the response.

return [get_bytes_value(image) for image in doc_results] if doc_results else None
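
The comments below report a `UnicodeDecodeError` on byte 0xff when returning raw bytes, and that base64 encoding resolved it. A likely explanation is that JPEG data starts with the bytes 0xFF 0xD8, which are not valid UTF-8, so the bytes cannot be placed in a JSON string directly. The following is a minimal sketch of the base64 round trip, not the asker's actual fix: the helper names `encode_jpeg_for_json` and `decode_jpeg_from_json` are hypothetical, and `fake_jpeg` is stand-in data used in place of the output of `get_bytes_value(image)`.

```python
import base64
import json


def encode_jpeg_for_json(jpeg_bytes: bytes) -> str:
    """Turn raw JPEG bytes into an ASCII string that is safe to put in JSON."""
    return base64.b64encode(jpeg_bytes).decode("ascii")


def decode_jpeg_from_json(encoded: str) -> bytes:
    """Reverse of encode_jpeg_for_json: recover the original JPEG bytes."""
    return base64.b64decode(encoded)


# Stand-in for get_bytes_value(image); real JPEG data also starts with 0xFF 0xD8.
fake_jpeg = b"\xff\xd8\xff\xe0" + b"\x00" * 8

# Server side: build a JSON-serializable payload, one string per page.
payload = json.dumps({"pages": [encode_jpeg_for_json(fake_jpeg)]})

# Client side: parse the JSON and decode each page back into bytes.
restored = decode_jpeg_from_json(json.loads(payload)["pages"][0])
```

In the endpoint, the return line would then wrap each call, for example `encode_jpeg_for_json(get_bytes_value(image))`. On the receiving side, the decoded bytes can be wrapped in `io.BytesIO` and opened with `PIL.Image.open`, so nothing has to be written to disk.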
Sung Kim
Yagiz Degirmenci
  • Can you provide some example code to help me out a bit? I have been trying to achieve this for the last few hours, but so far I have not been able to figure out how to return an array of bytes. – user_12 Jan 06 '21 at 20:16
  • I am getting an error, `UnicodeDecodeError: 'utf-8' codec can't decode byte 0xff in position 0: invalid start byte`. I have tried this solution, but it didn't work for me. – user_12 Jan 06 '21 at 20:31
  • Have you tried the solutions for that error from this [question](https://stackoverflow.com/questions/42339876/error-unicodedecodeerror-utf-8-codec-cant-decode-byte-0xff-in-position-0-in)? – Yagiz Degirmenci Jan 06 '21 at 20:41
  • I was able to figure out the issue: I had to use base64 encoding. – user_12 Jan 06 '21 at 20:54
  • Ah great, glad it helped. – Yagiz Degirmenci Jan 06 '21 at 22:21