
I've created an API in Laravel that allows users to upload zip archives containing images.

Once an archive is uploaded, it's sent to S3 and then picked up by another service for processing.

I'm finding that with larger archives, PHP keeps hitting its memory limit. I know I could raise the limit, but that feels like a slippery slope, especially once multiple users start uploading large files.

My current workaround has been to bypass my server entirely and let the client upload directly to S3. But this feels insecure and susceptible to spamming/DDoSing.

I guess what I'm really hoping for is a discussion about how this could be handled elegantly.

Is there a language more suitable for this sort of processing/concurrency? I could easily hand the upload process off to something else.

Are my concerns about S3 unfounded? I know every request needs to be signed, but the tokens generated are reusable, so they're exploitable.
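
To make the concern concrete, this is roughly the kind of short-lived, size-limited upload policy I have in mind. It's a minimal sketch assuming the AWS SDK for PHP v3 and its PostObjectV4 helper; the bucket name, key prefix, expiry and size cap are all placeholders.

```php
<?php
// Sketch: issue a short-lived, size-limited browser-upload policy per request.
// Assumes the AWS SDK for PHP v3; bucket, key prefix and limits are placeholders.
require 'vendor/autoload.php';

use Aws\S3\S3Client;
use Aws\S3\PostObjectV4;

$client = new S3Client([
    'version' => 'latest',
    'region'  => 'eu-west-1',
]);

$bucket = 'my-upload-bucket';                     // hypothetical bucket
$key    = 'uploads/' . bin2hex(random_bytes(16)); // unique key per signature

$formInputs = ['acl' => 'private', 'key' => $key];

$conditions = [
    ['acl' => 'private'],
    ['bucket' => $bucket],
    ['eq', '$key', $key],                  // policy is only valid for this one key
    ['content-length-range', 1, 52428800], // cap uploads at ~50 MB
];

// The policy expires after 5 minutes, so a leaked signature is only briefly useful.
$postObject = new PostObjectV4($client, $bucket, $formInputs, $conditions, '+5 minutes');

// Hand these to the client so it can build its multipart/form-data POST to S3.
$formAttributes = $postObject->getFormAttributes();
$formInputs     = $postObject->getFormInputs();
```

Rate limiting how often an authenticated user can request one of these signatures would then have to happen on the PHP side.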

Resources online suggest NGINX as a better fit, since it has an upload module that writes uploads directly to a file, whereas Apache appears to do a lot of the work in memory (I'm not 100% sure about this).

I'm pretty unclear about the whole PHP upload process, if I'm honest. Is a request stored directly in memory? i.e. would ten 50 MB uploads cause a memory limit exception against my 500 MB of RAM?
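
My tentative understanding, sketched out below, is that the upload itself lands in a temp file on disk and memory only becomes a problem if that file is read into a string before being handed to S3. The sketch assumes the AWS SDK for PHP v3; the bucket and key names are placeholders.

```php
<?php
// Sketch: by the time PHP hands control to the script, the upload is already a
// temp file on disk ($_FILES['archive']['tmp_name']), not a blob in RAM.
// Assumes the AWS SDK for PHP v3; bucket and key names are placeholders.
require 'vendor/autoload.php';

use Aws\S3\S3Client;

$client = new S3Client(['version' => 'latest', 'region' => 'eu-west-1']);

$tmpPath = $_FILES['archive']['tmp_name']; // already on disk

// Memory-hungry: reads the whole archive into a PHP string first.
// $client->putObject([
//     'Bucket' => 'my-upload-bucket',
//     'Key'    => 'uploads/archive.zip',
//     'Body'   => file_get_contents($tmpPath),
// ]);

// Streams from disk instead, so memory use stays roughly constant.
$client->putObject([
    'Bucket' => 'my-upload-bucket',
    'Key'    => 'uploads/archive.zip',
    'Body'   => fopen($tmpPath, 'rb'),
]);
```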

user3750194
  • There is a good discussion on this http://stackoverflow.com/questions/864570/very-large-uploads-with-php – Victory Jul 29 '15 at 23:08
  • Uploaded files are stored on the hard drive by default (e.g. as a file in a temp folder). I'm not sure about the rest. Good question, though! – Cully Jul 29 '15 at 23:08
  • This might be another good resource: http://stackoverflow.com/questions/12609451/forward-a-file-upload-stream-to-s3-through-iteratee-with-play2-scala – Cully Jul 29 '15 at 23:09
  • This too: http://blog.tcs.de/post-file-to-s3-using-node/ – Cully Jul 29 '15 at 23:10
  • Thanks @CullyLarson, those links were very helpful. I've been using Scala professionally for the last couple of months, so it might be a solution. I've pretty much implemented that exact Node app, but in PHP. My concern is that an authenticated user can get as many policies/signatures as they want, so this may be susceptible to spamming. I imagine a solution to this might be extremely short-lived AWS credentials plus rate limiting on the PHP side. There's no reason why a user should be requesting 100 AWS credentials per second! But it still didn't feel as airtight as pushing their files through my own services. – user3750194 Jul 30 '15 at 20:10
  • Another alternative solution could be to use a queue. Laravel supports plenty out of the box; I personally use Beanstalkd. For each uploaded file, add a job to be handled by one of the workers. Depending on the number of workers you have, you may still have a large overall memory usage, but you shouldn't hit the per-process memory limit (depending on the size of the file; chunking files is another battle altogether). – SArnab Jul 30 '15 at 21:06
  • @SArnab thanks for the suggestion, but the problem isn't processing! I've already passed that off to queues and, more recently, AWS Lambda functions. The issue is uploading the files to the server: Laravel throws a memory limit exception either on the actual upload or when moving the files to S3. – user3750194 Jul 30 '15 at 21:43
  • Hmm, interesting situation. Based on what I know of S3, during a PUT stream operation the contents of the file are buffered into a temporary stream (http://docs.aws.amazon.com/aws-sdk-php/v2/guide/feature-s3-stream-wrapper.html), so that definitely could eat up the memory limit. But to clarify: when uploading a file, it is not stored in the RAM belonging to the PHP process; it is independent of the memory_limit setting. The settings pertaining to uploads are "max_file_size" and "upload_max_filesize". The issue is likely with the move to S3. – SArnab Jul 30 '15 at 23:23
  • Great point. Looking into this now, I've realised the version of the AWS SDK I was running was a major version behind, so this issue could be a memory leak that might not be present in newer versions. I'm also going to update Laravel while I'm at it. – user3750194 Jul 30 '15 at 23:46

1 Answer


Thanks for the discussion, everyone. Looking into the PHP POST/upload process cleared up how things work a little.

Updating the SDK appeared to eliminate those initial memory limit issues.
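
For anyone who hits the same wall, the pattern that keeps memory bounded on large archives is to stream the file in parts rather than read it into a string. Below is a minimal sketch, assuming the AWS SDK for PHP v3 and its MultipartUploader; the bucket, key, local path and part size are placeholders, not my actual setup.

```php
<?php
// Sketch: upload a large archive in fixed-size parts so memory stays bounded.
// Assumes the AWS SDK for PHP v3; bucket, key, path and part size are placeholders.
require 'vendor/autoload.php';

use Aws\S3\S3Client;
use Aws\S3\MultipartUploader;
use Aws\Exception\MultipartUploadException;

$client = new S3Client(['version' => 'latest', 'region' => 'eu-west-1']);

$uploader = new MultipartUploader($client, '/tmp/uploaded-archive.zip', [
    'bucket'    => 'my-upload-bucket',
    'key'       => 'uploads/archive.zip',
    'part_size' => 10 * 1024 * 1024, // 10 MB parts, read from disk one at a time
]);

try {
    $result = $uploader->upload();
} catch (MultipartUploadException $e) {
    // The partial upload state is available via $e->getState() if you want to retry.
    error_log('Upload failed: ' . $e->getMessage());
}
```

The part size caps how much of the archive is held in memory at any one time, however large the overall file is.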

Of course, I'm still looking into the issue of concurrency, but I feel like that's more of an Apache/NGINX/server-config and spec optimisation problem than a language problem.

Thanks everyone!

user3750194