
I'm working on a Flask backend app. Profile images come in from the Android frontend to a Flask API endpoint, and I want to store them.

Tech stack : Android app, Flask API/backend, Postgres, AWS services.

What would be the best idea?

I thought of the following ideas. Do let me know if any of these ideas make sense!

  1. Storing the images directly in the Postgres database. (I think this is bad, as it will put load on the database.)

  2. Storing the images in Amazon S3 buckets and storing the S3 file paths as a column in a Postgres table.

An improvement on 2 that I thought of: keep the arrangement from 2, and put a CDN such as Amazon CloudFront in front of it for faster distribution to other services requesting the images.

How does it sound? Any other ideas?

Thanks! :)

coffee_is_bae
  • Option 2 is the more common one. Storing the images as binary blobs in the database can become a major bottleneck, since every request goes through Flask. A CDN can serve files with much better performance. – Håken Lid Feb 24 '18 at 00:39

1 Answer


Option 2 is a commonly used approach, but depending on how the images are used, the lo type in Postgres for binary large objects could also be OK. This mostly comes down to the throughput you need for writing the images and the request interface that other applications use to retrieve them. For example, serializing and deserializing blob images through a Python web framework would not be a good idea, but for certain use cases something like a gRPC server might be fine this way.

For option 2, it's a good idea to think about how to create some type of image ID that is robust to changes in file paths or names, format changes, storage bucket changes, etc. It can be a pain, but it's worth setting things up so that all these pesky aspects of image specifiers are properly normalized.
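As a rough illustration of that idea, here is a minimal sketch in which the image ID is an opaque UUID and the bucket, key, and format are just attributes that can change later. The `ImageRecord` class, the `profile-images/` prefix, and the bucket names are all hypothetical, and the record stands in for a row in a Postgres table rather than showing real database or S3 calls.

```python
import uuid
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class ImageRecord:
    """Stand-in for one row of a hypothetical `images` table in Postgres."""
    image_id: str      # stable, opaque identifier handed out to clients
    bucket: str        # current S3 bucket (may change over time)
    object_key: str    # current key inside the bucket (may change)
    content_type: str  # e.g. "image/jpeg"


def new_image_record(bucket: str, content_type: str) -> ImageRecord:
    """Create a record whose ID does not depend on the uploaded file name."""
    image_id = uuid.uuid4().hex
    # Assumed mapping from content type to file extension.
    extension = {"image/jpeg": "jpg", "image/png": "png"}.get(content_type, "bin")
    object_key = f"profile-images/{image_id}.{extension}"
    return ImageRecord(image_id, bucket, object_key, content_type)


def storage_location(record: ImageRecord) -> str:
    """Resolve the current storage location from the record."""
    return f"s3://{record.bucket}/{record.object_key}"


record = new_image_record("example-profile-images", "image/jpeg")
# Moving to another bucket only updates the row; the ID clients hold is unchanged.
moved = replace(record, bucket="example-profile-images-v2")
```

Other tables (for example a users table) would then reference `image_id` rather than a raw S3 path, so a bucket migration or format change touches only the image rows.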

It's also worth thinking about whether you need a range of sizes or post-processing outputs, such as storing 'thumbnail', 'original size' and 'small' versions of the images. It can be a huge pain to backfill this stuff if it's not laid out in a sane way up front.
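One possible way to lay this out up front is to derive every variant's object key from the image ID and a fixed set of variant names. The sketch below assumes a hypothetical `profile-images/<image_id>/<variant>.<ext>` key scheme and made-up pixel sizes; it only builds keys and does not resize or upload anything.

```python
# Hypothetical variant names mapped to maximum pixel dimensions.
# None means the image is stored as uploaded.
VARIANTS = {
    "original": None,
    "small": (256, 256),
    "thumbnail": (64, 64),
}


def variant_key(image_id: str, variant: str, extension: str = "jpg") -> str:
    """Build the object key for one variant of an image."""
    if variant not in VARIANTS:
        raise ValueError(f"unknown variant: {variant!r}")
    return f"profile-images/{image_id}/{variant}.{extension}"


def all_variant_keys(image_id: str, extension: str = "jpg") -> dict:
    """Return the keys for every known variant of an image."""
    return {name: variant_key(image_id, name, extension) for name in VARIANTS}


keys = all_variant_keys("abc123")
```

With a layout like this, adding a new size later means adding one entry to `VARIANTS` and generating the missing objects for existing IDs, rather than reworking how paths are stored.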

So much of this depends on your actual use case. How will the data set grow? What sort of latency is required when people request image data? Will you ever store other types of assets besides images? What sort of throughput demands are there, both for ingestion and for serving the images later?

ely