I would like to forward the logs of select services running on my EKS cluster to CloudWatch for cluster-independent storage and better observability.

Following the quickstart outlined at https://docs.aws.amazon.com/AmazonCloudWatch/latest/monitoring/Container-Insights-setup-EKS-quickstart.html, I've managed to get the logs forwarded via the Fluent Bit service, but that has also generated 170 Container Insights metrics channels. Not only are those metrics not required, they also appear to cost a fair bit.

How can I disable the collection of cluster metrics such as CPU, memory, and network, and keep only the forwarding of container logs to CloudWatch? I'm having a very hard time finding any documentation on this.

Toms Mikoss

2 Answers

I think I figured it out: the cloudwatch-agent DaemonSet from the quickstart guide is what sends the metrics, and it is not required for log forwarding. None of the objects with names related to cloudwatch-agent in the quickstart YAML file are needed for log forwarding.

Toms Mikoss
  • Could you expand on that a bit? Are the logs being forwarded frequently? How did you change the configuration or startup? – Ernesto Jul 04 '22 at 09:07
  • @Ernesto Not sure what to expand there. I just deleted every resource with a name related to `cloudwatch-agent` from the Amazon quickstart YAML before applying it. That achieved my goal of getting only the logs, and not the CPU/RAM metrics. The logs seem to appear in CloudWatch in "real time", that is, moments after the action that causes the log write. Good enough for the use case of "inspect logs later on", but I haven't measured the exact speed. – Toms Mikoss Jul 05 '22 at 09:10
  • The edit queue on this answer is full, so I am commenting and writing a second answer, but credit to Toms since he found it. To stop metric creation you need to delete the entire object named `metrics` in the JSON or YAML configuration file. – Ernesto Jul 06 '22 at 07:56

As suggested by Toms Mikoss, you need to delete the metrics object in your configuration file. This is the file that you pass to the agent when starting it.

This applies to on-premises Linux installations. I haven't tested this on Windows or EC2, but I imagine it will be similar. The AWS documentation says that you can also distribute the configuration via SSM, but again, I imagine the answer here still applies.

Example of file with metrics:

{
    "agent": {
        "metrics_collection_interval": 60,
        "run_as_user": "root"
    },
    "logs": {
        "logs_collected": {
            "files": {
                "collect_list": [
                    {
                        "file_path": "/var/log/nginx.log",
                        "log_group_name": "nginx",
                        "log_stream_name": "{hostname}"
                    }
                ]
            }
        }
    },
    "metrics": {
        "metrics_collected": {
            "cpu": {
                "measurement": [
                    "cpu_usage_idle",
                    "cpu_usage_iowait"
                ],
                "metrics_collection_interval": 60,
                "totalcpu": true
            }
        }
    }
}

Example of file without metrics:

{
    "agent": {
        "metrics_collection_interval": 60,
        "run_as_user": "root"
    },
    "logs": {
        "logs_collected": {
            "files": {
                "collect_list": [
                    {
                        "file_path": "/var/log/nginx.log",
                        "log_group_name": "nginx",
                        "log_stream_name": "{hostname}"
                    }
                ]
            }
        }
    }
}
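
The same edit can be done programmatically if you manage several of these files. This is just a sketch using Python's standard `json` module; `strip_metrics` is a hypothetical helper name, and it assumes the configuration is valid JSON with `metrics` as a top-level key, as in the example above.

```python
import json


def strip_metrics(config_text: str) -> str:
    """Return the agent configuration with the top-level "metrics" object removed."""
    config = json.loads(config_text)
    # pop with a default, so a file that already lacks "metrics" passes through unchanged
    config.pop("metrics", None)
    return json.dumps(config, indent=4)
```

Only the top-level `metrics` object is removed; the `metrics_collection_interval` setting under `agent` is left as it is, matching the two example files. Write the returned text back to the configuration file before passing it to the agent.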

For reference, the command to start the agent on Linux on-premises servers:

sudo /opt/aws/amazon-cloudwatch-agent/bin/amazon-cloudwatch-agent-ctl -a fetch-config \
 -m onPremise -s -c file:configuration-file-path

More details are available in the AWS documentation.

Ernesto