
I am trying to create a transcoding function for short videos. The function is hosted on AWS Lambda. The problem is that the Lambda environment seems to be missing something that FFmpeg requires, at least according to Amazon.

I contacted Amazon earlier and this is their response to the issue:

We found that the FFMPEG operations require at least libx264 and an aac library, both of which will have dependencies of their own. To troubleshoot the issue it will involve diving deeper into the full dependency chain. We can see that it works in the Amazon Linux environment however, the environment is similar but not identical to the lambda environment. There can be some dependencies that exist in Amazon Linux but not in lambda environment as Lambda runs on the container. Here, as FFmpeg is a third party software, diving deeper into the dependency chain and verifying the version compatibilities is very hard to do. Unfortunately going further, this is bound to go into architecture and code support which is out of AWS Support scope. I hope you understand our limitations. However should FFmpeg support have any questions specific to the Lambda platform, please do let us know and we will be happy to assist. We will be in better position to investigate further once you receive an update from the FFmpeg support suggesting an issue from Lambda end.

At AWS's suggestion, I contacted FFmpeg on the developers mailing list. My message was rejected on the grounds that it is better suited to the ffmpeg users mailing list than the developers list. I sent an email to 'ffmpeg-user@ffmpeg.org' a week ago and have not received a response yet.

I then built a dynamically linked ffmpeg, making sure to package all of its libraries, and ran ldd on each one. Next I wrote a small Lambda function that looped over all the binaries and ran ldd on each of them, and compared the result with the output I got on Amazon Linux. The same dependencies and versions exist on both Lambda and the Amazon Linux instance, yet ffmpeg still fails on Lambda.
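For anyone who wants to repeat the dependency comparison, here is a minimal sketch of such a loop, not the exact function used above. It assumes `ldd` is available in the environment, and the helper names `parse_ldd_output` and `ldd_report` are made up for illustration. Run it in both environments and diff the two reports.

```python
import os
import subprocess


def parse_ldd_output(text):
    """Map each library name to the path ldd resolved it to (None if not found)."""
    deps = {}
    for line in text.splitlines():
        line = line.strip()
        if "=>" not in line:
            # Lines without "=>" (e.g. linux-vdso, the loader) carry no lookup result.
            continue
        name, _, rest = line.partition("=>")
        rest = rest.strip()
        if rest.startswith("not found"):
            deps[name.strip()] = None
        else:
            # Drop the trailing load address, e.g. "(0x00007f...)".
            deps[name.strip()] = rest.split("(")[0].strip() or None
    return deps


def ldd_report(directory):
    """Run ldd on every regular file in `directory` and collect the parsed results."""
    report = {}
    for entry in sorted(os.listdir(directory)):
        full_path = os.path.join(directory, entry)
        if not os.path.isfile(full_path):
            continue
        result = subprocess.run(
            ["ldd", full_path],
            stdout=subprocess.PIPE,
            stderr=subprocess.STDOUT,
            universal_newlines=True,
        )
        report[entry] = parse_ldd_output(result.stdout)
    return report
```

Any library whose value is `None` in one environment but not in the other would be a missing dependency. Identical reports, as in my case, suggest the problem lies elsewhere.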

You can find a detailed log file here: https://www.datafilehost.com/d/6e5e21bb

And this is a sample of the errors I'm getting, repeated across the entire log file:

2018-08-14T12:27:10.874Z [h264 @ 0x65c2fc0] concealing 2628 DC, 2628 AC, 2628 MV errors in P frame

2018-08-14T12:27:10.874Z [aac @ 0x65d2f00] channel element 2.11 is not allocated

2018-08-14T12:27:10.874Z Error while decoding stream #0:1: Invalid data found when processing input

2018-08-14T12:27:10.874Z [h264 @ 0x67e86c0] Invalid NAL unit size (108085662 > 1649).

2018-08-14T12:27:10.874Z [h264 @ 0x67e86c0] Error splitting the input into NAL units.

2018-08-14T12:27:10.874Z [aac @ 0x65d2f00] channel element 2.0 is not allocated

2018-08-14T12:27:10.874Z Error while decoding stream #0:1: Invalid data found when processing input

2018-08-14T12:27:10.874Z [h264 @ 0x68189c0] Invalid NAL unit size (71106974 > 1085).

2018-08-14T12:27:10.874Z [h264 @ 0x68189c0] Error splitting the input into NAL units.

2018-08-14T12:27:10.874Z [aac @ 0x65d2f00] Pulse tool not allowed in eight short sequence.

This log is generated when trying to perform an HLS transcoding on this file: https://www.datafilehost.com/d/999a4492

Note that the issue is not specific to that file, nor is it related to HLS. It is general and happens with all videos and with any ffmpeg command that tries to seek in the stream. Even extracting a single frame from a video in the simplest form possible, for example `ffmpeg -ss 00:00:02 -i file.mp4 -vframes 1 -y output.jpg`, fails with the same errors as in the log file.

I'm not sure how to debug this further. I tried enabling debug logs with '-loglevel debug', but that did not give me any extra information. Any help or suggestions would be appreciated.

Zaid Amir
  • Does it run fine when you don't seek? – Gyan Sep 10 '18 at 07:32
  • Also, you did receive a reply on ffmpeg-user, but you haven't followed up: https://lists.ffmpeg.org/pipermail/ffmpeg-user/2018-September/041124.html – Gyan Sep 10 '18 at 07:34
  • @Gyan The only thing that runs fine is calling `ffmpeg -i file.mp4` anything that tries to read the video fails – Zaid Amir Sep 10 '18 at 08:10
  • How is the source file actually getting into the Lambda container? Is it passed in the invocation request payload? If so, you may be corrupting it unless you are passing it as base64. The Lambda API is JSON, which does not support arbitrary binary data, and anything in the payload that isn't valid utf-8 will almost certainly be coerced to utf-8 with substitution of � (U+FFFD/hex 0xEF 0xBF 0xBD) in place of bytes that aren't actual valid characters. Using base64 for binary payload works around this. The errors give the impression of corrupt data, rather than a dependency issue. – Michael - sqlbot Sep 10 '18 at 09:20
  • @Michael-sqlbot The file is downloaded and stored on Lambda's disk. At first I thought the same thing, yet when I calculate the MD5 of the downloaded file it is correct. In addition to the MD5 check, I tried re-uploading the file to S3 after downloading it, and the binary content was retained. – Zaid Amir Sep 10 '18 at 13:12
  • I'm having the exact same issue here. – derekhh Feb 18 '19 at 08:33

3 Answers


I ran into the exact same issue today and spent hours on it, but I finally came across this SO answer and found the solution.

Basically you need to make sure you are not passing STDIN to the FFmpeg process. It's mentioned in the re:Invent talk on this slide.
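As a rough illustration, here is one way to keep STDIN away from ffmpeg when launching it from Python. This is a sketch under assumptions: the function names `build_frame_cmd` and `run_without_stdin` are made up, and it combines ffmpeg's `-nostdin` option with redirecting the child's stdin to the null device. Either measure alone may be enough.

```python
import subprocess


def build_frame_cmd(src, dst, seek="00:00:02", ffmpeg="ffmpeg"):
    """Build a single-frame extraction command that tells ffmpeg not to read stdin."""
    return [
        ffmpeg,
        "-nostdin",        # disable ffmpeg's interaction on standard input
        "-ss", seek,
        "-i", src,
        "-vframes", "1",
        "-y", dst,
    ]


def run_without_stdin(cmd):
    """Run a command with its stdin attached to the null device instead of the parent's."""
    return subprocess.run(
        cmd,
        stdin=subprocess.DEVNULL,
        stdout=subprocess.PIPE,
        stderr=subprocess.PIPE,
        check=False,
    )
```

Usage would look like `run_without_stdin(build_frame_cmd("/tmp/file.mp4", "/tmp/output.jpg"))`, where both paths are placeholders. Other runtimes have an equivalent; the point is only that the child process should not inherit the function's STDIN.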

derekhh

If you have trouble with ffmpeg then try avconv instead. avconv is a fork of ffmpeg and can be called in the same way. I had the same problems as you did with ffmpeg when trying to decode an aac audio stream within the lambda environment but a static build of avconv by John Van Sickle worked as expected for me.

Also, make sure to assign enough RAM to your lambda function. The static ffmpeg binaries are big and encoding needs a lot of RAM on top of that, especially if you are encoding videos.

faxi
  • Maybe your ffmpeg was ancient. – llogan Nov 26 '18 at 19:04
  • @LordNeckbeard: I used ffmpeg 4.1 which was released less than a month ago (static build by John van Sickle) and which works perfectly on my local debian machine. – faxi Nov 27 '18 at 20:11
  • So 4.1 worked fine with the same input file on your local debian machine, but the same version did not work on lambda? It would be nice to get to the bottom of this. – llogan Nov 27 '18 at 21:04
  • Yes, indeed. With the same static binary of ffmpeg 4.1 and the same input file, the conversion from aac to wav fails on Lambda but works on my local Debian machine. And if I replace ffmpeg with a static build of avconv 12.1 (again a static build by John van Sickle), then I have no problem whatsoever in Lambda. I am also wondering how we can understand what is going wrong there. – faxi Nov 29 '18 at 23:09

Have you tried using a statically compiled ffmpeg?

Here is what worked for me:

  1. Grabbed a static build of ffmpeg from https://johnvansickle.com/ffmpeg/releases/ffmpeg-release-64bit-static.tar.xz
  2. Extracted the archive and discarded everything except the ffmpeg binary itself. Made the binary executable with chmod and placed it at bin/ffmpeg inside the deployment folder (in my case the folder is zipped and deployed by zappa).
  3. Copied the ffmpeg binary to /tmp. (I got an "operation not permitted" error when executing ffmpeg from /var/task/bin. My knowledge of ffmpeg itself is limited, but my assumption is that ffmpeg creates temporary files during processing, and since Lambda has a read-only file system, this can only be done inside /tmp. In my case I am converting caf to mp4.)
  4. Added /tmp to the search path so ffmpeg is available on PATH (needed when using third-party libraries that encapsulate ffmpeg calls, like pydub).

Here is the relevant Python code. Since Lambda containers are reused, /tmp might already contain the copied ffmpeg.

from os import environ, path
from shutil import copy2

# Assumption: PROJECT_ROOT is the deployment root (/var/task on Lambda);
# the original snippet used it without showing its definition.
PROJECT_ROOT = path.dirname(path.abspath(__file__))

# True only when running inside the Lambda execution environment
ON_LAMBDA = environ.get('AWS_EXECUTION_ENV', '').startswith('AWS_Lambda_')

# check if we have ffmpeg inside /tmp, if we do, no need to copy
# otherwise copy ffmpeg from /var/task/bin to /tmp
if ON_LAMBDA and not path.isfile('/tmp/ffmpeg'):
    copy2(path.join(PROJECT_ROOT, 'bin/ffmpeg'), '/tmp/ffmpeg')

# add /tmp to search paths if it's not there
# so ffmpeg executed from pydub will be found
custom_deps_bin_path = '/tmp/'
if ON_LAMBDA and custom_deps_bin_path not in environ['PATH']:
    environ['PATH'] += ":" + custom_deps_bin_path
There are also a few relevant third-party projects for this:

  1. binoculars/aws-lambda-ffmpeg
  2. ubergarm/zappa-ffmpeg
ambientlight