
My work requires me to encode a few thousand movies in a few days. Each movie needs to be encoded in 3 different formats. I use ffmpeg to produce these formats in parallel from a single read of the input source, as detailed here: http://ffmpeg.org/trac/ffmpeg/wiki/Creating%20multiple%20outputs
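As a rough illustration of the single-read, multiple-output pattern described above, the sketch below only builds and prints an ffmpeg command line; it does not run ffmpeg. The shared options are taken from the command in this question. The helper name `multi_output_cmd`, the output file names, and the 1000k and 2000k bitrates are assumptions made for illustration, since the three actual formats are not stated here.

```shell
#!/bin/sh
# Hypothetical helper: print an ffmpeg command that reads the input once
# and writes three outputs. Only the 500k variant comes from the question;
# the other two bitrates and all output names are made up.
multi_output_cmd() {
  in=$1
  base=$(basename "${in%.*}")   # sample.mpg -> sample
  # Output options are repeated before each output file because ffmpeg
  # applies them per output.
  common="-vcodec libx264 -vprofile baseline -level 30 -acodec libfdk_aac -ab 128k -ac 2 -threads 1"
  echo "ffmpeg -y -i $in" \
    "$common -b:v 500k encoded/${base}_500k.mp4" \
    "$common -b:v 1000k encoded/${base}_1000k.mp4" \
    "$common -b:v 2000k encoded/${base}_2000k.mp4"
}

multi_output_cmd sample.mpg
```

The point of the pattern is that the input is given once (`-i`) and each output path is preceded by its own set of output options.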

In addition, I am using GNU Parallel to encode from multiple video files in parallel. We have four blade servers of different configurations (48, 32, 16 and 16 cores) encoding videos in parallel. Ideally, we should be able to encode 112 videos in parallel.

However, it seems that encoding completes faster on machines with fewer cores. I get 16 completed encodes on the 16-core servers in around 4 hours, while it takes close to 10 hours for 48 encodes to complete on the 48-core system. What could be the bottleneck? A typical encode command is as follows:

ffmpeg -i sample.mpg -y -vcodec libx264 -vprofile baseline -level 30 -acodec libfdk_aac -ab 128k -ac 2 -b:v 500K -threads 1 encoded/sample_enc.mp4

Any pointers would be highly appreciated. Thanks!

souvik
  • Storage could be a bottleneck. It looks like you are using some sort of shared storage. Is it an NFS share? Look at the CPU load breakdown on the blades to check whether processes are waiting for I/O or busy-waiting on some other OS resource. – Dima Chubarov Aug 16 '13 at 06:18
  • Indeed, my guess is storage IS a bottleneck. Unfortunately, I do not have the time to copy over content from the portable drives I receive. Copying of TBs of data itself would take a while. I simply attach them to the servers and start encoding. Right now, I am trying out alexbuisson's suggestion. Will update in a couple of hours. – souvik Aug 16 '13 at 06:41
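
One way to follow up on the I/O suggestion in the comments above is to measure how much CPU time is spent waiting on I/O. The sketch below assumes a Linux host, where the first line of /proc/stat lists cumulative CPU time as user, nice, system, idle, iowait, irq, softirq and steal. The helper name `iowait_pct` is made up for illustration. It takes two samples of that line and reports the iowait share of the elapsed CPU time between them.

```shell
#!/bin/sh
# Hypothetical helper: given two "cpu ..." lines from /proc/stat taken some
# time apart, print the percentage of CPU time spent in iowait in between.
iowait_pct() {
  printf '%s\n%s\n' "$1" "$2" | awk '
    NR == 1 { for (i = 2; i <= 9; i++) a[i] = $i }   # first sample
    NR == 2 {
      total = 0
      for (i = 2; i <= 9; i++) total += $i - a[i]    # user..steal deltas
      if (total > 0) printf "%d\n", 100 * ($6 - a[6]) / total
      else print 0
    }'
}

# Live measurement, only where /proc/stat is available (Linux).
if [ -r /proc/stat ]; then
  first=$(head -n 1 /proc/stat)
  sleep 1
  second=$(head -n 1 /proc/stat)
  echo "iowait: $(iowait_pct "$first" "$second")%"
fi
```

A persistently high value while the encodes run would point toward the source disks rather than the CPUs. The `wa` figure in `top` reports the same kind of measurement.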

1 Answer


One encode split across n cores is fine, but n encodes running in parallel, one per core, would saturate bandwidth.

Try enabling 4 threads per transcode to speed up the throughput of a single video, and limit the number of videos you encode in parallel to 2 or 3. The right numbers depend on the memory you have, on your bandwidth, and on the video (SD vs. HD, for example).

alexbuisson
  • Thanks! Will try this out right away and post the results here. Any rule of thumb for the optimal encodes in parallel and the number of threads each encode uses, or do I need to hit upon the correct figure by trial and error? Really appreciate your help. – souvik Aug 16 '13 at 05:29
  • If a rule exists, it must be a complex one, as there are a lot of theses on distributed encoding. – alexbuisson Aug 17 '13 at 16:13
  • I bet it would be complex. Well, I followed your advice and it has worked wonders so far. I am now using ffmpeg with a '-threads 4' option and GNU Parallel with a '-j 25%' option, and am able to churn out 4 complete encodes (all three formats) on 16 cores every hour. Thanks! – souvik Aug 18 '13 at 17:37
  • And as Dmitri suggested, I/O can quickly become a bottleneck: on a 32- or 48-core system, with the above parameters and all encodes reading from the same source disk, the average drops. I do not get 8 or 12 complete encodes every hour; it takes longer. However, it is still faster than the scheme I initially posted about. – souvik Aug 18 '13 at 17:41
  • Thanks for the news. It looks like you have everything you need to monitor and optimize your system. Good luck! – alexbuisson Aug 18 '13 at 19:07
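
The arithmetic behind the settings reported in the comments above (ffmpeg with '-threads 4', GNU Parallel with '-j 25%') can be sketched as follows. The helper name `jobs_for` is made up for illustration, and dividing cores by threads is only a starting point, not a rule: as noted in the thread, the larger machines did not scale linearly when reading from a single source disk.

```shell
#!/bin/sh
# Hypothetical helper: number of parallel encodes so that
# jobs * threads-per-encode does not exceed the core count.
jobs_for() {
  _cores=$1
  _threads=$2
  _jobs=$(( _cores / _threads ))
  if [ "$_jobs" -lt 1 ]; then _jobs=1; fi   # always run at least one job
  echo "$_jobs"
}

threads=4
for cores in 16 32 48; do
  echo "$cores cores: $(jobs_for "$cores" "$threads") parallel encodes at $threads threads each"
done

# With GNU Parallel, '-j 25%' gives the same job count as cores / 4.
# A possible invocation (not run here; the options are elided):
#   parallel -j 25% ffmpeg -i {} ... -threads 4 encoded/{/.}_enc.mp4 ::: *.mpg
```

For 16 cores this gives 4 jobs, matching the 4 concurrent encodes reported above. The 8 and 12 jobs for the 32- and 48-core machines are the ideal figures that I/O contention prevented in practice.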