
I'm having some issues publishing logs from CloudWatch -> Kinesis Stream -> Kinesis Delivery Stream -> Transformation Lambda -> AWS OpenSearch. While the documentation is straightforward, I'm struggling with the transformation Lambda and the insertion into OpenSearch.

The error I'm getting is the following:

Multiple records were returned with the same record Id. Ensure that the Lambda function returns a unique record Id for each record.

What I do in my Lambda is the following:

public DataTransformationLambda.TransformedResult handleRequest(final KinesisFirehoseEvent kinesisEvent, final Context context) {
        context.getLogger().log(String.format("Date received: %s", kinesisEvent));

        final DataTransformationLambda.TransformedResult transformedResult = new DataTransformationLambda.TransformedResult();
        final List<DataTransformationLambda.TransformedResult.Record> records = new ArrayList<>();
        for (final KinesisFirehoseEvent.Record record : kinesisEvent.getRecords()) {
            try {
                // CloudWatch Logs subscriptions deliver the payload gzip-compressed.
                final byte[] payload = record.getData().array();
                context.getLogger().log(String.format("Processing payload of %d bytes", payload.length));
                final ByteArrayInputStream byteArrayInputStream = new ByteArrayInputStream(payload);
                final GZIPInputStream gzipInputStream = new GZIPInputStream(byteArrayInputStream);
                final InputStreamReader inputStreamReader = new InputStreamReader(gzipInputStream, StandardCharsets.UTF_8);
                final BufferedReader bufferedReader = new BufferedReader(inputStreamReader);
                String line;

                while ((line = bufferedReader.readLine()) != null) {
                    context.getLogger().log(String.format("Decoded data: %s", line));
                    // Each line is one CloudWatch Logs subscription message.
                    JsonNode jsonNode = DataTransformationLambda.objectMapper.readTree(line);
                    ArrayNode logEvents = (ArrayNode) jsonNode.get("logEvents");

                    logEvents.forEach(logEvent -> {
                        try {
                            DataTransformationLambda.TransformedResult.Record newRecord = new DataTransformationLambda.TransformedResult.Record();
                            // Reusing the source recordId for every logEvent is what
                            // triggers the duplicate-recordId error.
                            newRecord.setRecordId(record.getRecordId());
                            String logEventId = logEvent.get("id").asText();
                            long timestamp = logEvent.get("timestamp").asLong();
                            String message = logEvent.get("message").asText();
                            ObjectNode objectNode = objectMapper.createObjectNode();
                            objectNode.put("logEventId", logEventId);
                            objectNode.put("timestamp", timestamp);
                            objectNode.put("message", message);
                            String objectNodeString = objectNode.toString();

                            String encodedNode = Base64.getEncoder().encodeToString(objectNodeString.getBytes(StandardCharsets.UTF_8));
                            newRecord.setData(encodedNode);
                            newRecord.setResult("Ok");
                            records.add(newRecord);
                        } catch (Exception e) {
                            e.printStackTrace();
                        }
                    });
                }
            }
            catch (Exception e2) {
                e2.printStackTrace();
                context.getLogger().log(String.format("Error while processing: %s", e2.getMessage()));
            }
        }
        context.getLogger().log(String.format("Records: %s", records));
        transformedResult.setRecords(records);
        return transformedResult;
    }
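
For comparison, my understanding of the transformation contract is: exactly one output record per input record, carrying the original recordId. Below is a minimal sketch that satisfies it (reusing the TransformedResult classes from above, so not a drop-in snippet) by aggregating all logEvents of one input record into a single newline-delimited payload:

// Minimal sketch: one output record per input record, same recordId, with all
// logEvents collapsed into a single newline-delimited JSON payload.
private DataTransformationLambda.TransformedResult.Record aggregateLogEvents(
        final KinesisFirehoseEvent.Record record) throws IOException {
    final StringBuilder ndjson = new StringBuilder();
    try (final BufferedReader reader = new BufferedReader(new InputStreamReader(
            new GZIPInputStream(new ByteArrayInputStream(record.getData().array())),
            StandardCharsets.UTF_8))) {
        String line;
        while ((line = reader.readLine()) != null) {
            final JsonNode message = objectMapper.readTree(line);
            for (final JsonNode logEvent : message.get("logEvents")) {
                final ObjectNode doc = objectMapper.createObjectNode();
                doc.put("logEventId", logEvent.get("id").asText());
                doc.put("timestamp", logEvent.get("timestamp").asLong());
                doc.put("message", logEvent.get("message").asText());
                ndjson.append(doc.toString()).append('\n');
            }
        }
    }
    final DataTransformationLambda.TransformedResult.Record transformed =
            new DataTransformationLambda.TransformedResult.Record();
    transformed.setRecordId(record.getRecordId()); // exactly one output per input
    transformed.setResult("Ok");
    transformed.setData(Base64.getEncoder().encodeToString(
            ndjson.toString().getBytes(StandardCharsets.UTF_8)));
    return transformed;
}

This makes the duplicate-recordId error go away, but as far as I can tell the OpenSearch destination indexes each Firehose record as a single document, so all logEvents of a record would end up as one document, which is exactly what I'm trying to avoid.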

I know that each recordId has to be unique and that the returned recordIds need to match the incoming ones, but I need all of the logEvents to be queryable from AWS OpenSearch by timestamp. Is there a way to do it the way I'm trying to? So far I haven't found any. The JSON content I get from the Firehose event is:

{
    "owner": "111111111111",
    "logGroup": "CloudTrail/logs",
    "logStream": "111111111111_CloudTrail/logs_us-east-1",
    "subscriptionFilters": [
        "Destination"
    ],
    "messageType": "DATA_MESSAGE",
    "logEvents": [
        {
            "id": "31953106606966983378809025079804211143289615424298221568",
            "timestamp": 1432826855000,
            "message": "{\"eventVersion\":\"1.03\",\"userIdentity\":{\"type\":\"Root\"}"
        },
        {
            "id": "31953106606966983378809025079804211143289615424298221569",
            "timestamp": 1432826855000,
            "message": "{\"eventVersion\":\"1.03\",\"userIdentity\":{\"type\":\"Root\"}"
        },
        {
            "id": "31953106606966983378809025079804211143289615424298221570",
            "timestamp": 1432826855000,
            "message": "{\"eventVersion\":\"1.03\",\"userIdentity\":{\"type\":\"Root\"}"
        }
    ]
}

and I need each of the logEvents to be a separate message; at least they need to appear that way in OpenSearch. The closest idea I've come across (but haven't been able to validate) is sketched below.
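
If I understand the AWS CloudWatch Logs processor blueprint correctly, it re-ingests records via PutRecordBatch when they grow too large, so maybe the same trick works here: return the original record with result "Dropped" and put each logEvent back into the delivery stream as its own record. A rough sketch using the AWS SDK for Java v2 — the delivery-stream name, the helper name, and the client wiring are all hypothetical:

import java.util.ArrayList;
import java.util.List;
import software.amazon.awssdk.core.SdkBytes;
import software.amazon.awssdk.services.firehose.FirehoseClient;
import software.amazon.awssdk.services.firehose.model.PutRecordBatchRequest;
import software.amazon.awssdk.services.firehose.model.Record;

private static final FirehoseClient firehoseClient = FirehoseClient.create();

// Hypothetical helper: each entry in logEventJsonDocs is one already-transformed
// logEvent as a JSON string; every entry becomes its own Firehose record. The
// original input record would be returned with result "Dropped" instead of "Ok".
private void reingestLogEvents(final List<String> logEventJsonDocs) {
    final List<Record> batch = new ArrayList<>();
    for (final String doc : logEventJsonDocs) {
        batch.add(Record.builder().data(SdkBytes.fromUtf8String(doc)).build());
    }
    // PutRecordBatch accepts at most 500 records per call; a real version would
    // chunk the list and retry anything reported in failedPutCount().
    firehoseClient.putRecordBatch(PutRecordBatchRequest.builder()
            .deliveryStreamName("my-delivery-stream") // placeholder name
            .records(batch)
            .build());
}

What I can't tell is how the second pass through the same transformation Lambda should treat these re-ingested records: they are no longer gzip-compressed CloudWatch payloads, so presumably the Lambda would have to detect that and pass them through unchanged with result "Ok". Is this re-ingestion pattern the intended approach, or is there a simpler way?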
