I'm using the AWS CLI to get some Kinesis metrics. As part of that, I can specify the output format as one of the options described here: https://docs.aws.amazon.com/cli/latest/userguide/cli-chap-configure.html#cli-quick-configuration-format

Output Format

The Default output format specifies how the results are formatted. The value can be any of the values in the following list. If you don't specify an output format, json is used as the default.

json – The output is formatted as a JSON string.

yaml – The output is formatted as a YAML string. (Available in the AWS CLI version 2 only.)

text – The output is formatted as multiple lines of tab-separated string values. This can be useful to pass the output to a text processor, like grep, sed, or awk.

table – The output is formatted as a table using the characters +|- to form the cell borders. It typically presents the information in a "human-friendly" format that is much easier to read than the others, but not as programmatically useful.

I've tried text, as that seems the most reasonable for Splunk, but I think the line-separated data is messing up Splunk's ingest:

METRICDATARESULTS   iteratorAgeMilliseconds itagemillis PartialData
METRICDATARESULTS   readProvisionedThroughputExceeded   itagemillis PartialData
TIMESTAMPS  2020-04-15T20:21:00+00:00
TIMESTAMPS  2020-04-15T20:20:00+00:00
TIMESTAMPS  2020-04-15T20:19:00+00:00
TIMESTAMPS  2020-04-15T20:18:00+00:00
TIMESTAMPS  2020-04-15T20:17:00+00:00
TIMESTAMPS  2020-04-15T20:16:00+00:00
VALUES  0.0
VALUES  0.0
VALUES  0.0
VALUES  0.0
VALUES  0.0
VALUES  0.0
METRICDATARESULTS   writeProvisionedThroughputExceeded  itagemillis PartialData
TIMESTAMPS  2020-04-15T19:36:00+00:00
TIMESTAMPS  2020-04-15T19:35:00+00:00
TIMESTAMPS  2020-04-15T19:34:00+00:00
TIMESTAMPS  2020-04-15T19:33:00+00:00
VALUES  0.0
VALUES  0.0
VALUES  0.0
VALUES  0.0
VALUES  0.0
VALUES  0.0
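
For reference, the same data with --output json follows the GetMetricData response shape, roughly like this (trimmed to one result for brevity; values taken from the text output above):

{
    "MetricDataResults": [
        {
            "Id": "iteratorAgeMilliseconds",
            "Label": "itagemillis",
            "Timestamps": [
                "2020-04-15T20:21:00+00:00",
                "2020-04-15T20:20:00+00:00"
            ],
            "Values": [
                0.0,
                0.0
            ],
            "StatusCode": "PartialData"
        }
    ]
}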

Any thoughts, on either the AWS or Splunk side, on how best to handle ingesting this data?

Here's the CLI command: aws cloudwatch get-metric-data --start-time 16:29 --end-time 23:59 --metric-data-queries file://metric-data-queries.json --output text. And here are the contents of metric-data-queries.json:

[
  {
    "Id": "iteratorAgeMilliseconds",
    "MetricStat": {
      "Metric": {
        "Namespace": "AWS/Kinesis",
        "MetricName": "GetRecords.IteratorAgeMilliseconds",
        "Dimensions": [
          {
            "Name": "StreamName",
            "Value": "test.dev.com"
          }
        ]
      },
      "Period": 1,
       "Stat": "Sum",
        "Unit": "Count"
    },
    "Label": "itagemillis",
    "ReturnData": true
  },
  {
    "Id": "readProvisionedThroughputExceeded",
    "MetricStat": {
      "Metric": {
        "Namespace": "AWS/Kinesis",
        "MetricName": "ReadProvisionedThroughputExceeded",
        "Dimensions": [
          {
            "Name": "StreamName",
            "Value": "test.dev.com"
          }
        ]
      },
      "Period": 1,
       "Stat": "Sum",
        "Unit": "Count"
    },
    "Label": "itagemillis",
    "ReturnData": true
  },
  {
    "Id": "writeProvisionedThroughputExceeded",
    "MetricStat": {
      "Metric": {
        "Namespace": "AWS/Kinesis",
        "MetricName": "WriteProvisionedThroughputExceeded",
        "Dimensions": [
          {
            "Name": "StreamName",
            "Value": "test.dev.com"
          }
        ]
      },
      "Period": 1,
       "Stat": "Sum",
        "Unit": "Count"
    },
    "Label": "itagemillis",
    "ReturnData": true
  },
  {
    "Id": "putRecordSuccess",
    "MetricStat": {
      "Metric": {
        "Namespace": "AWS/Kinesis",
        "MetricName": "PutRecord.Success",
        "Dimensions": [
          {
            "Name": "StreamName",
            "Value": "test.dev.com"
          }
        ]
      },
      "Period": 1,
       "Stat": "Sum",
        "Unit": "Count"
    },
    "Label": "itagemillis",
    "ReturnData": true
  },
  {
    "Id": "putRecordsSuccess",
    "MetricStat": {
      "Metric": {
        "Namespace": "AWS/Kinesis",
        "MetricName": "PutRecords.Success",
        "Dimensions": [
          {
            "Name": "StreamName",
            "Value": "test.dev.com"
          }
        ]
      },
      "Period": 1,
       "Stat": "Sum",
        "Unit": "Count"
    },
    "Label": "itagemillis",
    "ReturnData": true
  },
  {
    "Id": "getRecordsSuccess",
    "MetricStat": {
      "Metric": {
        "Namespace": "AWS/Kinesis",
        "MetricName": "GetRecords.Success",
        "Dimensions": [
          {
            "Name": "StreamName",
            "Value": "test.dev.com"
          }
        ]
      },
      "Period": 1,
       "Stat": "Sum",
        "Unit": "Count"
    },
    "Label": "itagemillis",
    "ReturnData": true
  }
]
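
For completeness, here's the JSON variant of the command I also tried (see the comments on the answer below):

aws cloudwatch get-metric-data --start-time 16:29 --end-time 23:59 --metric-data-queries file://metric-data-queries.json --output json
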
Tony

1 Answer

You will find that Splunk handles JSON pretty well out of the box, so I would recommend using that over the other options. You may need to set KV_MODE = json for the sourcetype you ingest, but Splunk should extract JSON fields by default.
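
For example, a minimal props.conf stanza (the sourcetype name here is just a placeholder for whatever you assign to this input):

[aws:cloudwatch:metricdata]
KV_MODE = json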

For more, see https://docs.splunk.com/Documentation/Splunk/latest/Knowledge/Automatickey-valuefieldextractionsatsearch-time

You can also look at the Splunk apps that integrate with AWS, such as the Splunk Add-on for AWS (https://splunkbase.splunk.com/app/1876/) and the Splunk Add-on for Amazon Kinesis Firehose (https://splunkbase.splunk.com/app/3719/).
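
If Splunk still treats the whole response as a single event, another option is to pre-flatten the JSON into one object per timestamp/value pair before ingesting it. A rough sketch using jq (assuming jq is installed; this isn't Splunk-specific):

aws cloudwatch get-metric-data --start-time 16:29 --end-time 23:59 --metric-data-queries file://metric-data-queries.json --output json \
  | jq -c '.MetricDataResults[] | . as $r
      | [$r.Timestamps, $r.Values] | transpose[]
      | {id: $r.Id, label: $r.Label, timestamp: .[0], value: .[1]}'

With -c, jq emits one compact JSON object per line, which Splunk can then break into individual events.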

Simon Duff
  • I tried JSON, but it's having an issue parsing out all of the timestamps and values as individual events. It gives the entire JSON one timestamp. – Tony Apr 15 '20 at 23:18
  • https://answers.splunk.com/answers/289520/how-to-split-a-json-array-into-multiple-events-wit.html You may be able to use BREAK_ONLY_BEFORE={"Id (see the sketch below). – Simon Duff Apr 15 '20 at 23:35
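
For reference, the line-breaking approach from the last comment might look like this in props.conf (a sketch; the sourcetype name is a placeholder, and the regex escapes the literal brace):

[aws:cloudwatch:metricdata]
SHOULD_LINEMERGE = true
BREAK_ONLY_BEFORE = \{"Id"
KV_MODE = json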