
What do people think are the most important issues when developing an application that will allow users to upload video and images to a server, have them transcoded by FFMPEG, and store them in Amazon S3? I have a couple of options:

1) Install FFMPEG on the same server that handles file uploads. When a video is uploaded and stored on the EC2 instance, call FFMPEG to convert it; when done, write the file to an S3 bucket and dispose of the original.

How scalable is this? What happens when many users upload at the same time? How do I manage multiple processes at once? How do I know when to start another instance and load balance this configuration?

2) Have one server for processing uploads (updating the database, renaming files, etc.) and one server for doing the transcoding. Again, what is the best way to manage multiple processes? Should I be looking at Amazon SQS for this? Can I tell the transcoding server to get the file from the upload server, or should I copy the file to the transcoding server? Should I just store all files on S3 and have the transcoding server fetch them from there based on SQS messages? I am trying to generate as little traffic as possible.

I am running a Linux box as the upload server and have FFMPEG running on it.

Any advice on best practices for setting up such a configuration would be appreciated. Many thanks.

    Amazon Web Services recently released a new web service called [Amazon Elastic Transcoder](http://aws.amazon.com/elastictranscoder "Amazon Elastic Transcoder"). – Adam Feb 01 '13 at 15:32

4 Answers


I don't think you'll want to start a new FFMPEG instance every time someone uploads a file for transcoding. Instead, you'll probably want to start the same number of FFMPEG processes as the number of CPUs you have, then queue up the input files you want to transcode and do them in the order they were received. You could do this all on one computer: I don't think the process that accepts the uploads and puts them in the queue will take much CPU, so it can probably coexist just fine with the FFMPEG processes.
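To make the idea concrete, here is a minimal sketch of that fixed worker pool: one FFMPEG worker per CPU core pulling jobs from a local in-process queue. The file paths and FFMPEG flags are assumptions, not a definitive setup:

```python
import multiprocessing
import subprocess

def transcode_worker(queue):
    """Pull (input, output) jobs off the queue until a sentinel arrives."""
    while True:
        src, dst = queue.get()
        if src is None:          # sentinel: shut this worker down
            break
        # One ffmpeg process at a time per worker keeps CPU usage predictable.
        subprocess.run(
            ["ffmpeg", "-y", "-i", src, "-c:v", "libx264", "-c:a", "aac", dst],
            check=True,
        )

if __name__ == "__main__":
    jobs = multiprocessing.Queue()
    workers = [
        multiprocessing.Process(target=transcode_worker, args=(jobs,))
        for _ in range(multiprocessing.cpu_count())
    ]
    for w in workers:
        w.start()

    # The upload handler would enqueue jobs like this as files arrive.
    jobs.put(("/uploads/clip.mov", "/transcoded/clip.mp4"))

    for _ in workers:
        jobs.put((None, None))   # one sentinel per worker
    for w in workers:
        w.join()
```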

Depending on how big you want to scale (if you want more than just a few FFMPEG processes on a single machine), you could easily make this distributed, and this is where SQS comes in handy. You could still run one FFMPEG process per core, but instead of looking in a local queue for work, each process would poll SQS. Then you could instantiate as many transcoding processes as you need, across different machines.

The downside to this is that you will need to transfer the raw videos from the server that accepts them to the server that transcodes them. You could put them in S3 and then grab them out of S3, but I don't remember off the top of my head whether you have to pay for that. Alternatively, you could just keep them on the hard disk of the machine that received them and have the transcoding process fetch the raw files from there.
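Here is a rough sketch of how such a distributed worker might look with boto3, pulling job messages from SQS and moving files through S3. The queue URL, bucket name, and message format are hypothetical:

```python
import json
import subprocess
import boto3

sqs = boto3.client("sqs")
s3 = boto3.client("s3")
QUEUE_URL = "https://sqs.us-east-1.amazonaws.com/123456789012/transcode-jobs"
BUCKET = "my-video-bucket"

while True:
    resp = sqs.receive_message(QueueUrl=QUEUE_URL, MaxNumberOfMessages=1,
                               WaitTimeSeconds=20)       # long poll
    for msg in resp.get("Messages", []):
        job = json.loads(msg["Body"])                    # {"key": "raw/clip.mov"}
        raw, out = "/tmp/raw.mov", "/tmp/out.mp4"
        s3.download_file(BUCKET, job["key"], raw)        # raw video in
        subprocess.run(["ffmpeg", "-y", "-i", raw,
                        "-c:v", "libx264", "-c:a", "aac", out], check=True)
        s3.upload_file(out, BUCKET, "transcoded/" + job["key"])  # result out
        # Delete only after success, so a crashed worker's job is redelivered.
        sqs.delete_message(QueueUrl=QUEUE_URL,
                           ReceiptHandle=msg["ReceiptHandle"])
```

Deleting the message only after a successful upload means SQS will redeliver the job if a worker dies mid-transcode.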

teeks99
  • Great, thanks for your response. I am now at the stage where the server that handles the uploads calls FFMPEG to process the uploaded video, then writes the encoded file to Amazon S3. While this is happening, though, the script waits until all processes have finished, i.e. the user has to wait for the video to encode before the next video uploads, etc. I agree with you that I can probably manage the uploading and encoding on a single machine, but how do you suggest I run the transcoding in the background, and how do I detect when a file has been transcoded so I can copy it to S3? Thanks again – Oct 02 '09 at 15:12
  • So you have a process that is doing the transcoding; can't the same process just put the file in S3 when it is done with it? Maybe when the web-facing app kicks off the transcoding process, it can pass in an argument that tells the transcoding process where in S3 to put the result. – teeks99 Oct 03 '09 at 13:18
  • Remember that FFMPEG can take in data from STDIN and output to STDOUT. Don't forget to look at all the streaming command line options available! – jduncanator Jan 05 '14 at 07:20
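Building on that last comment, here is a quick sketch of piping data through FFMPEG without touching disk. The filename and codec flags are illustrative; note that containers whose headers lead the file (e.g. FLV, MPEG-TS) pipe much more cleanly than MP4/MOV, whose index may sit at the end:

```python
import subprocess

# Stream a source file into ffmpeg's stdin and read the transcoded bytes
# back from stdout; the result could be streamed straight to S3.
with open("clip.flv", "rb") as src:
    proc = subprocess.Popen(
        ["ffmpeg", "-i", "pipe:0",             # read input from stdin
         "-c:v", "libx264", "-c:a", "aac",
         "-f", "mpegts", "pipe:1"],            # write output to stdout
        stdin=src, stdout=subprocess.PIPE,
    )
    transcoded = proc.stdout.read()
    proc.wait()
```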

You can check out Piper. It's an open-source version of a product I originally built for a huge entertainment company to handle their video transcoding at scale.

okrunner

You should have a look at Amazon Elastic Transcoder. It solves almost all of the problems you have mentioned in the question.
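For illustration, submitting a job is a single API call once a pipeline exists. Below is a minimal boto3 sketch; the pipeline ID is hypothetical, and the preset shown is AWS's generic 720p system preset:

```python
import boto3

# Assumes a pipeline already binds the input/output buckets and an IAM role.
et = boto3.client("elastictranscoder", region_name="us-east-1")
job = et.create_job(
    PipelineId="1111111111111-abcde1",               # hypothetical pipeline ID
    Input={"Key": "raw/clip.mov"},
    Outputs=[{"Key": "transcoded/clip.mp4",
              "PresetId": "1351620000001-000010"}],  # generic 720p system preset
)
print(job["Job"]["Id"], job["Job"]["Status"])        # poll read_job() for progress
```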


There are actually several methods you can use to solve your problem:

1. Using EC2 cron jobs, you can run a simple PHP script that checks your database (e.g., every 30s) for any new video available for transcoding (you can use a simple DB attribute for this, processed: Boolean). See the sketch after this list.

2. Using the AWS Lambda service, you can detect any new video uploaded to your S3 bucket, trigger a Lambda function to generate thumbnails and do the transcoding, and send the output to your target bucket. Check out this great tool by @binoculars; it requires some JS & Gulp understanding, but it's very handy & smooth.

3. Using AWS Elastic Transcoder. It's pretty expensive: if your jobs get rounded up to the nearest minute, that's a huge cost when your videos are short. If you're Netflix or Amazon running long jobs to transcode movies, ET makes a lot more sense.
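Here is a minimal sketch of option 1, written in Python rather than the PHP the answer suggests, so it matches the other examples on this page. The table layout, bucket name, and use of SQLite as a stand-in database are all assumptions:

```python
import sqlite3
import subprocess
import boto3

s3 = boto3.client("s3")
db = sqlite3.connect("videos.db")   # stand-in for whatever DB you actually use

# Run from cron; pick up every video not yet processed.
rows = db.execute("SELECT id, path FROM videos WHERE processed = 0").fetchall()
for video_id, path in rows:
    out = path + ".mp4"
    subprocess.run(["ffmpeg", "-y", "-i", path,
                    "-c:v", "libx264", "-c:a", "aac", out], check=True)
    s3.upload_file(out, "my-video-bucket", "transcoded/%d.mp4" % video_id)
    db.execute("UPDATE videos SET processed = 1 WHERE id = ?", (video_id,))
    db.commit()
```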

Nourdine Alouane