I need to filter incoming data from Kinesis Firehose by event type: only certain events should be written to S3, and the rest should be ignored. I am using a Lambda function to filter the records, with the following Python code:

import base64
import json

def lambda_handler(event, context):
    output = []

    for record in event['records']:
        payload = base64.b64decode(record["data"])
        payload_json = json.loads(payload)
        event_type = payload_json["eventPayload"]["operation"]

        if event_type == "create" or event_type == "update":
            output_record = {
                'recordId': record['recordId'],
                'result': 'Ok',
                'data': base64.b64encode(payload)}
            output.append(output_record)
        else:
            output_record = {
                'recordId': record['recordId'],
                'result': 'Dropped'}
            output.append(output_record)

        return {'records': output}

I only want to process "create" and "update" events and drop the rest. I started from the sample code in the AWS docs and built on it.

This produces the following error:

{"attemptsMade":1,"arrivalTimestamp":1653289182740,"errorCode":"Lambda.MissingRecordId","errorMessage":"One or more record Ids were not returned. Ensure that the Lambda function returns all received record Ids.","attemptEndingTimestamp":1653289231611,"rawData":"some data","lambdaArn":"arn:$LATEST"}

I don't understand what this error means or how to fix it.

Harish J

1 Answer

  1. Bug: The return statement is inside the for loop, and this is the cause of the error. The function receives multiple records, but it returns after handling the first one, so only one recordId is sent back. Unindent the return statement so that it runs after the loop finishes.
  2. The data key must be included in output_record, even if the event is being dropped. You can return the original base64-encoded payload with no transformations.
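A minimal sketch of the handler with both points applied, assuming the payload layout from the question (eventPayload.operation). The KEPT_OPERATIONS name is introduced here for illustration only. The data value is passed back as the original base64 string from record['data'], which also avoids returning bytes from base64.b64encode in Python 3:

```python
import base64
import json

# Hypothetical constant: the operations to keep, taken from the question's filter.
KEPT_OPERATIONS = {"create", "update"}


def lambda_handler(event, context):
    output = []

    for record in event["records"]:
        # Decode only to inspect the operation; the payload itself is not modified.
        payload = base64.b64decode(record["data"])
        operation = json.loads(payload)["eventPayload"]["operation"]

        output.append({
            "recordId": record["recordId"],
            "result": "Ok" if operation in KEPT_OPERATIONS else "Dropped",
            # Return the original base64 string unchanged, for kept and dropped records alike.
            "data": record["data"],
        })

    # Return once, after every record has been handled.
    return {"records": output}
```

Every incoming recordId now appears exactly once in the response, whether the record is kept or dropped.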

Additional context: event['records'] and output must be the same length (length validation). Each dictionary in output must have a recordId key whose value equals the recordId of a dictionary in event['records'] (recordId validation).
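To see which IDs are missing before deploying, a check like the following can be run locally. This is only a rough approximation of the recordId validation described above, not Firehose's actual implementation, and missing_record_ids is a hypothetical helper name:

```python
def missing_record_ids(event, response):
    """Return the input recordIds that the response did not send back."""
    expected = [r["recordId"] for r in event["records"]]
    returned = {r["recordId"] for r in response["records"]}
    return [rid for rid in expected if rid not in returned]


# Two records go in, but an early return sends back only the first one.
event = {"records": [{"recordId": "a", "data": ""}, {"recordId": "b", "data": ""}]}
early_return = {"records": [{"recordId": "a", "result": "Dropped", "data": ""}]}

print(missing_record_ids(event, early_return))  # ['b']
```

An empty list means every received recordId was returned, which is the condition the Lambda.MissingRecordId error complains about.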

From AWS documentation:

The record ID is passed from Kinesis Data Firehose to Lambda during the invocation. The transformed record must contain the same record ID. Any mismatch between the ID of the original record and the ID of the transformed record is treated as a data transformation failure.

Reference: Amazon Kinesis Data Firehose Data Transformation

Andrew Nguonly