
First, I am relatively new to coding and am teaching myself what I need as I go. I have pieced together example code from various forums to get to where I am now. I am running an AWS Lambda function that triggers when a new file is uploaded to an S3 bucket and passes the file to MediaInfo (I built a self-contained CLI executable and uploaded it with the Lambda function). The result is in XML format, and I have managed to save it to a DynamoDB table.

My question: I want to take the XML produced by this function and push it to an SNS topic so that I can pick it up and use it elsewhere (a Knack database). Here is my Lambda code in full (with private information changed).

import logging
import subprocess

import boto3

SIGNED_URL_EXPIRATION = 300     # The number of seconds that the signed URL is valid
DYNAMODB_TABLE_NAME = "demo_metadata"
DYNAMO = boto3.resource("dynamodb")
TABLE = DYNAMO.Table(DYNAMODB_TABLE_NAME)

logger = logging.getLogger('boto3')
logger.setLevel(logging.INFO)


def lambda_handler(event, context):
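    """
    Run MediaInfo on each uploaded S3 object and save the XML to DynamoDB.

    :param event:    S3 event notification that triggered the function
    :param context:  Lambda runtime context (unused)
    """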
    # Loop through records provided by S3 Event trigger
    for s3_record in event['Records']:
        logger.info("Working on new s3_record...")
        # Extract the Key and Bucket names for the asset uploaded to S3
        key = s3_record['s3']['object']['key']
        bucket = s3_record['s3']['bucket']['name']
        logger.info("Bucket: {} \t Key: {}".format(bucket, key))
        # Generate a signed URL for the uploaded asset
        signed_url = get_signed_url(SIGNED_URL_EXPIRATION, bucket, key)
        logger.info("Signed URL: {}".format(signed_url))
        # Launch MediaInfo
        # Pass the signed URL of the uploaded asset to MediaInfo as an input
        # MediaInfo will extract the technical metadata from the asset
        # The extracted metadata will be outputted in XML format and
        # stored in the variable xml_output
        xml_output = subprocess.check_output(["./mediainfo", "--full", "--output=XML", signed_url])
        logger.info("Output: {}".format(xml_output))
        save_record(key, xml_output)


def save_record(key, xml_output):
    """
    Save record to DynamoDB

    :param key:         S3 Key Name
    :param xml_output:  Technical Metadata in XML Format
    :return:            None
    """
    logger.info("Saving record to DynamoDB...")
    TABLE.put_item(
        Item={
            'keyName': key,
            'technicalMetadata': xml_output
        }
    )
    logger.info("Saved record to DynamoDB")


def get_signed_url(expires_in, bucket, obj):
    """
    Generate a signed URL
    :param expires_in:  URL Expiration time in seconds
    :param bucket:
    :param obj:         S3 Key name
    :return:            Signed URL
    """
    s3_cli = boto3.client("s3")
    presigned_url = s3_cli.generate_presigned_url(
        'get_object',
        Params={'Bucket': bucket, 'Key': obj},
        ExpiresIn=expires_in
    )
    return presigned_url

The output I get from the Lambda function when running it in the AWS console is below, and this is what I want to send to an SNS topic.

<Height>1080</Height>
<Height>1 080 pixels</Height>
<Stored_Height>1088</Stored_Height>
<Sampled_Width>1920</Sampled_Width>
<Sampled_Height>1080</Sampled_Height>
<Pixel_aspect_ratio>1.000</Pixel_aspect_ratio>
<Display_aspect_ratio>1.778</Display_aspect_ratio>
<Display_aspect_ratio>16:9</Display_aspect_ratio>
<Rotation>0.000</Rotation>
<Frame_rate_mode>CFR</Frame_rate_mode>
<Frame_rate_mode>Constant</Frame_rate_mode>
<Frame_rate>29.970</Frame_rate>
<Frame_rate>29.970 (30000/1001) fps</Frame_rate>
<FrameRate_Num>30000</FrameRate_Num>
<FrameRate_Den>1001</FrameRate_Den>
<Frame_count>630</Frame_count>
<Resolution>8</Resolution>
<Resolution>8 bits</Resolution>
<Colorimetry>4:2:0</Colorimetry>
<Color_space>YUV</Color_space>
<Chroma_subsampling>4:2:0</Chroma_subsampling>
<Chroma_subsampling>4:2:0</Chroma_subsampling>
<Bit_depth>8</Bit_depth>
<Bit_depth>8 bits</Bit_depth>
<Scan_type>Progressive</Scan_type>
<Scan_type>Progressive</Scan_type>
<Interlacement>PPF</Interlacement>
<Interlacement>Progressive</Interlacement>
<Bits__Pixel_Frame_>0.129</Bits__Pixel_Frame_>
<Stream_size>21374449</Stream_size>
<Stream_size>20.4 MiB (99%)</Stream_size>
<Stream_size>20 MiB</Stream_size>
<Stream_size>20 MiB</Stream_size>
<Stream_size>20.4 MiB</Stream_size>
<Stream_size>20.38 MiB</Stream_size>
<Stream_size>20.4 MiB (99%)</Stream_size>
<Proportion_of_this_stream>0.98750</Proportion_of_this_stream>
<Encoded_date>UTC 2017-11-24 19:29:16</Encoded_date>
<Tagged_date>UTC 2017-11-24 19:29:16</Tagged_date>
<Buffer_size>16000000</Buffer_size>
<Color_range>Limited</Color_range>
<colour_description_present>Yes</colour_description_present>
<Color_primaries>BT.709</Color_primaries>
<Transfer_characteristics>BT.709</Transfer_characteristics>
<Matrix_coefficients>BT.709</Matrix_coefficients>
</track>

<track type="Audio">
<Count>272</Count>
<Count_of_stream_of_this_kind>1</Count_of_stream_of_this_kind>
<Kind_of_stream>Audio</Kind_of_stream>
<Kind_of_stream>Audio</Kind_of_stream>
<Stream_identifier>0</Stream_identifier>
<StreamOrder>1</StreamOrder>
<ID>2</ID>
<ID>2</ID>
<Format>AAC</Format>
<Format_Info>Advanced Audio Codec</Format_Info>
<Commercial_name>AAC</Commercial_name>
<Format_profile>LC</Format_profile>
<Codec_ID>40</Codec_ID>
<Codec>AAC LC</Codec>
<Codec>AAC LC</Codec>
<Codec_Family>AAC</Codec_Family>


</File>
</Mediainfo>


[INFO]  2018-04-22T18:50:01.803Z    efde8294-465d-11e8-9ad2-0db0d6b36746    Saving record to DynamoDB...
[INFO]  2018-04-22T18:50:02.21Z efde8294-465d-11e8-9ad2-0db0d6b36746    Saved record to DynamoDB
END RequestId: efde8294-465d-11e8-9ad2-0db0d6b36746
REPORT RequestId: efde8294-465d-11e8-9ad2-0db0d6b36746  Duration: 9769.02 ms    Billed Duration: 9800 ms    Memory Size: 128 MB Max Memory Used: 61 MB  

Many thanks in advance to anyone with advice!

  • Why not just add some code to the lambda which sends the message to SNS? http://boto3.readthedocs.io/en/latest/reference/services/sns.html#SNS.Client.publish You can send the message and still output the result from the lambda function if both are important. Alternately you could do it in multiple steps by having two lambda functions in a step function: https://aws.amazon.com/step-functions/, but the former idea is easier – dpwr Apr 22 '18 at 19:21
  • Thank you!! This worked like a charm - now I just need to figure out how to parse the XML output into JSON so I can use it... – Richard Clarke Apr 23 '18 at 22:49
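
For anyone landing here later, the approach suggested in the comment above might look roughly like the sketch below. It is not the exact code used in the question: the topic ARN, the helper name publish_metadata and the keyName message attribute are all made up for illustration. It also assumes the XML fits in a single SNS message (the limit is 256 KB), which MediaInfo's --full output may exceed for complex files.

```python
# Hypothetical ARN: replace with the ARN of your own SNS topic.
SNS_TOPIC_ARN = "arn:aws:sns:us-east-1:123456789012:mediainfo-output"
SNS_MAX_BYTES = 256 * 1024  # SNS rejects messages larger than 256 KB


def publish_metadata(key, xml_output, sns_client=None, topic_arn=SNS_TOPIC_ARN):
    """Publish the MediaInfo XML for one S3 key to an SNS topic."""
    # subprocess.check_output returns bytes on Python 3; SNS expects text.
    if isinstance(xml_output, bytes):
        xml_output = xml_output.decode("utf-8")
    if len(xml_output.encode("utf-8")) > SNS_MAX_BYTES:
        raise ValueError("XML for {} is too large for one SNS message".format(key))
    if sns_client is None:
        # Imported here so the helper can be exercised with a stub client
        # on a machine that has no AWS credentials.
        import boto3
        sns_client = boto3.client("sns")
    response = sns_client.publish(
        TopicArn=topic_arn,
        Message=xml_output,
        # Carry the S3 key alongside the payload so subscribers can identify it.
        MessageAttributes={
            "keyName": {"DataType": "String", "StringValue": key},
        },
    )
    return response["MessageId"]
```

In the handler from the question, this would be called right after save_record(key, xml_output), for example publish_metadata(key, xml_output). The Lambda execution role also needs sns:Publish permission on the topic.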

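
On the follow-up about turning the XML into JSON: one possible sketch using only the standard library is below. The function name mediainfo_xml_to_json and the shape of the resulting JSON are my own choices, not anything MediaInfo defines. It assumes the older un-namespaced layout shown in the question (Mediainfo > File > track); newer MediaInfo versions put the elements in an XML namespace, so the tag lookup would need adjusting. Since MediaInfo repeats tags such as Height with different formatting, each tag maps to a list of values.

```python
import json
import xml.etree.ElementTree as ET


def mediainfo_xml_to_json(xml_output):
    """Convert MediaInfo XML (bytes or str) to a JSON string, one entry per track."""
    root = ET.fromstring(xml_output)
    tracks = []
    for track in root.iter("track"):
        fields = {}
        for child in track:
            # Repeated tags (e.g. two <Height> elements) are collected in order.
            fields.setdefault(child.tag, []).append((child.text or "").strip())
        tracks.append({"type": track.get("type"), "fields": fields})
    return json.dumps({"tracks": tracks})
```

The returned string could then be stored or published in place of the raw XML, and is usually smaller than the original document.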