1) You need to split web server and application server.
First of all do not try to stream media files from your backend unless you can offload low-level stuff to OS - most likely you will do it wrong.
Use proxy server as an web server to serve such files.
nginx will do.
Also you need to have backup of your media files the same way as you do backup of your database.
Storing static huge media files along with application server is wrong move - it will not scale at all.
You can add cron task to move files to some CDN server - when your move is complete you replace URL in database to match new location.
So by using nginx you will save precious CPU and RAM while file is getting moved to external server.
And CDN will help you to dedicate bandwidth and CPU/RAM resources to application server.
2) Regarding image resolution and downsampling:
Screens of modern handsets have the same or even better resolution compared to typical office workstation.
Link speeds have much bigger impact on UX.
If client has smartphone with huge screen but with slow link you still have to deliver image or video as fast as possible even if quality of media will not be match the resolution of handset.
It makes sense to downsample images on demand and store result on disk for nginx/CDN to serve it again.
In case of videos it makes sense to make "bad" version with big compression(quality loss) for the cases of slow link - device will downsample it itself during playback.
And you can keep client statistics (screen sizes/downlink speeds) and generate optimized versions of such video file later when you see that it is "popular".
FYI: Several years ago some social meda giant dropped idea to prepare all possible versions of the same media file in favour of FPGA on-the-fly resampler.
I do not remember the name of the company and URL to the article. It was probably instagram.
Some cloud providers have offers with FPGA or CUDA on board to do heavy lifting.
So in some cases you could exchange storage for heave horsepower to do conversion on the fly.