0

I am receiving scanned or photographed images from users of my web app (Python Flask), which are in turn stored in Azure Blob storage. I need to normalize all images to a standard resolution and reduce their file size. I intend to run OCR on these images, so retaining image quality is important too.

I need to do this on my server (Python Flask) before the image is stored in Azure Blob. I found libraries like picopt, but they didn't directly address the issue. There are also some scripts that can be called from the console, but I need them to run automatically on every upload.

Sorry if this is naive, but can anybody suggest a way to do this within the Python Flask app?

I am reading the file with `file = request.files['file']`. I want to process the image without saving it to disk, since I will be storing it in Azure Blob.

Goals

  • Monochrome image (binarize)
  • Image compression
  • Preserve aspect ratio
Peter Pan
Harvey

3 Answers

0

How about Pillow? It's a general-purpose image utilities module; specifically, with `Image.resize` you can do what you want.
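As a minimal sketch of how this could look for the goals in the question (binarize, compress, preserve aspect ratio), assuming Pillow is installed: the function name `normalize_scan`, the 1600-pixel bounding box, and the fixed threshold of 128 are illustrative assumptions, not values from the question. It uses `Image.thumbnail` rather than `Image.resize`, since `thumbnail` keeps the aspect ratio and only ever shrinks.

```python
import io

from PIL import Image

# Assumed bounding box for the "standard resolution"; tune it for your OCR engine.
MAX_SIZE = (1600, 1600)


def normalize_scan(stream, max_size=MAX_SIZE, threshold=128):
    """Return PNG bytes of a downscaled, binarized copy of the image in `stream`.

    `stream` is any binary file-like object; nothing is written to disk.
    """
    img = Image.open(stream)
    img = img.convert("L")  # grayscale first, so the threshold sees one channel
    # thumbnail() resizes in place, preserves aspect ratio, and never enlarges.
    img.thumbnail(max_size, Image.LANCZOS)
    # Fixed global threshold (an assumption; uneven lighting may need something smarter).
    img = img.point(lambda p: 255 if p >= threshold else 0).convert("1")
    out = io.BytesIO()
    img.save(out, format="PNG", optimize=True)  # 1-bit PNG is lossless and compact
    return out.getvalue()
```

In a Flask handler, the uploaded file from `request.files['file']` is file-like, so something like `normalize_scan(request.files['file'].stream)` should work without a temporary file, and the returned bytes can be handed to whatever blob upload call you already use.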

reptilicus
  • I am concerned about preserving the image aspect ratio. Also, I am reading the uploaded file with `file = request.files['file']`; I would like to know how to proceed from there. – Harvey Jan 07 '16 at 19:00
0

Image processing is a complex and usually time-consuming operation. Running the "available scripts" or Pillow (as @reptilicus suggested) inside your request handlers is not a good idea, because they will tie up resources for a long time and limit your application's performance. Instead, you can set up a Celery instance and fire off image processing tasks in the background. When an image processing task finishes, you can start another task to upload the resized image to Azure Blob. You will be able to retry tasks and do a lot more. A setup like this gives you robustness and scalability.
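To illustrate the pattern only, here is a rough sketch that uses a standard-library queue and a thread as a stand-in for Celery. This is not the Celery API: the names `start_worker`, `process`, and `store` are hypothetical, and an in-process thread loses its jobs on restart, which is exactly what a real task queue with a broker avoids.

```python
import queue
import threading


def start_worker(process, store):
    """Start a daemon thread that runs store(name, process(data)) for each job."""
    jobs = queue.Queue()

    def run():
        while True:
            name, data = jobs.get()
            try:
                # Step 1: process the image; step 2: hand the result to storage.
                store(name, process(data))
            except Exception as exc:
                # A real task queue would retry here; this sketch only reports.
                print(f"job {name} failed: {exc}")
            finally:
                jobs.task_done()

    threading.Thread(target=run, daemon=True).start()
    return jobs


# Usage: the request handler only enqueues and returns immediately.
stored = {}
jobs = start_worker(process=bytes.upper, store=stored.__setitem__)
jobs.put(("scan-1", b"raw bytes"))
jobs.join()  # only for the demo; a handler would not wait
```

Here `bytes.upper` and a dict stand in for the image processing step and the Azure Blob upload; in a real setup each would be its own task so it can be retried independently.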

nikihub
0

On Azure, the usual way to process images is to use a WebJob combined with an Azure Storage Queue.

You can first put the uploaded images into an Azure Storage Queue, then have a WebJob running as a background task retrieve them from the queue and process them one by one, storing the processed images in Azure Blob storage.

For reference, see the docs "Run Background tasks with WebJobs" and "How to use Queue storage from Python".

Hope it helps. If you have any concerns, please feel free to let me know.

Peter Pan