
Found out we have some old cloudwatch-agent (version 1.247354.0) pods managed by a DaemonSet, and I'm not able to tell whether they are actually sending anything to AWS CloudWatch. How can I check?

I see no errors in the agent logs, but the cwagent config file looks quite odd to me. It's set to send only logs, yet I see none of the logs_collected, log_group_name or log_stream_name params required by the docs.

Here is the /etc/cwagentconfig config file, loaded from a ConfigMap:

{
  "agent": {
    "region": "eu-west-1"
  },
  "logs": {
    "metrics_collected": {
      "kubernetes": {
        "cluster_name": "One-Of-Many-EKS-Cluster",
        "metrics_collection_interval": 60
      }
    },
    "force_flush_interval": 5
  }
}
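If I understand correctly, this "kubernetes" block under logs.metrics_collected is the Container Insights form of the config, which ships performance log events to CloudWatch Logs rather than tailing files (tailing is what logs_collected/log_group_name would be for). Assuming that's right, a quick way to verify delivery would be to look for recent streams in the expected log group; the log group naming below is my assumption based on the standard Container Insights setup:

```shell
# A minimal sketch, assuming the standard Container Insights log group
# naming (/aws/containerinsights/<cluster_name>/performance) and valid
# AWS credentials; cluster name and region come from the config above.
CLUSTER_NAME="One-Of-Many-EKS-Cluster"
LOG_GROUP="/aws/containerinsights/${CLUSTER_NAME}/performance"
echo "${LOG_GROUP}"

# If the agent is delivering data, the streams below should show recent
# lastEventTimestamp values (one stream per node, named after the host):
# aws logs describe-log-streams \
#     --log-group-name "${LOG_GROUP}" \
#     --order-by LastEventTime --descending --max-items 5 \
#     --query 'logStreams[].[logStreamName,lastEventTimestamp]' \
#     --region eu-west-1
```

If the group exists and the newest stream's timestamp is recent, I'd take that as confirmation the agent is sending data.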

And here's the agent log (some details are redacted):

2023/05/30 06:11:02 I! D! [EC2] Found active network interface
I! Detected the instance is EC2
2023/05/30 06:10:59 Reading json config file path: /opt/aws/amazon-cloudwatch-agent/bin/default_linux_config.json ...
/opt/aws/amazon-cloudwatch-agent/bin/default_linux_config.json does not exist or cannot read. Skipping it.
2023/05/30 06:10:59 Reading json config file path: /etc/cwagentconfig/..2023_05_30_06_10_40.838205086/cwagentconfig.json ...
2023/05/30 06:10:59 Find symbolic link /etc/cwagentconfig/..data 
2023/05/30 06:10:59 Find symbolic link /etc/cwagentconfig/cwagentconfig.json 
2023/05/30 06:10:59 Reading json config file path: /etc/cwagentconfig/cwagentconfig.json ...
2023/05/30 06:10:59 I! Valid Json input schema.
2023/05/30 06:10:59 I! attempt to access ECS task metadata to determine whether I'm running in ECS.
2023/05/30 06:11:00 W! retry [0/3], unable to get http response from http://just.an.host.ip/v2/metadata, error: unable to get response from http://just.an.host.ip/v2/metadata, error: Get "http://just.an.host.ip/v2/metadata": context deadline exceeded (Client.Timeout exceeded while awaiting headers)
2023/05/30 06:11:01 W! retry [1/3], unable to get http response from http://just.an.host.ip/v2/metadata, error: unable to get response from http://just.an.host.ip/v2/metadata, error: Get "http://just.an.host.ip/v2/metadata": context deadline exceeded (Client.Timeout exceeded while awaiting headers)
2023/05/30 06:11:02 W! retry [2/3], unable to get http response from http://just.an.host.ip/v2/metadata, error: unable to get response from http://just.an.host.ip/v2/metadata, error: Get "http://just.an.host.ip/v2/metadata": context deadline exceeded (Client.Timeout exceeded while awaiting headers)
2023/05/30 06:11:02 I! access ECS task metadata fail with response unable to get response from http://just.an.host.ip/v2/metadata, error: Get "http://just.an.host.ip/v2/metadata": context deadline exceeded (Client.Timeout exceeded while awaiting headers), assuming I'm not running in ECS.
No csm configuration found.
No metric configuration found.
Configuration validation first phase succeeded
 
2023/05/30 06:11:02 I! Config has been translated into TOML /opt/aws/amazon-cloudwatch-agent/etc/amazon-cloudwatch-agent.toml 
2023/05/30 06:11:02 D! toml config [agent]
  collection_jitter = "0s"
  debug = false
  flush_interval = "1s"
  flush_jitter = "0s"
  hostname = "ip-XX-XX-XX.eu-west-1.compute.internal"
  interval = "60s"
  logfile = ""
  logtarget = "lumberjack"
  metric_batch_size = 1000
  metric_buffer_limit = 10000
  omit_hostname = true
  precision = ""
  quiet = false
  round_interval = false

[inputs]

  [[inputs.cadvisor]]
    container_orchestrator = "eks"
    interval = "60s"
    mode = "detail"
    [inputs.cadvisor.tags]
      metricPath = "logs"

  [[inputs.k8sapiserver]]
    interval = "60s"
    node_name = "ip-XX-XX-XX.eu-west-1.compute.internal"
    [inputs.k8sapiserver.tags]
      metricPath = "logs_k8sapiserver"

[outputs]

  [[outputs.cloudwatchlogs]]
    force_flush_interval = "5s"
    log_stream_name = "ip-XX-XX-XX.eu-west-1.compute.internal"
    region = "eu-west-1"
    tagexclude = ["metricPath"]
    [outputs.cloudwatchlogs.tagpass]
      metricPath = ["logs", "logs_k8sapiserver"]

[processors]

  [[processors.ec2tagger]]
    disk_device_tag_key = "device"
    ebs_device_keys = ["*"]
    ec2_instance_tag_keys = ["aws:autoscaling:groupName"]
    ec2_metadata_tags = ["InstanceId", "InstanceType"]
    [processors.ec2tagger.tagpass]
      metricPath = ["logs"]

  [[processors.k8sdecorator]]
    cluster_name = "One-Of-Many-EKS-Cluster"
    host_ip = "XX.XX.XX.XX"
    node_name = "ip-XX-XX-XX.eu-west-1.compute.internal"
    order = 1
    prefer_full_pod_name = false
    tag_service = true
    [processors.k8sdecorator.tagpass]
      metricPath = ["logs", "logs_k8sapiserver"]
2023-05-30T06:11:03Z I! Starting AmazonCloudWatchAgent 1.247354.0
2023-05-30T06:11:03Z I! AWS SDK log level not set
2023-05-30T06:11:03Z I! Loaded inputs: cadvisor k8sapiserver
2023-05-30T06:11:03Z I! Loaded aggregators: 
2023-05-30T06:11:03Z I! Loaded processors: ec2tagger k8sdecorator
2023-05-30T06:11:03Z I! Loaded outputs: cloudwatchlogs
2023-05-30T06:11:03Z I! Tags enabled: 
2023-05-30T06:11:03Z I! [agent] Config: Interval:1m0s, Quiet:false, Hostname:"ip-XX-XX-XX.eu-west-1.compute.internal", Flush Interval:1s
2023-05-30T06:11:03Z I! [processors.ec2tagger] ec2tagger: Check ec2 metadata
2023-05-30T06:11:03Z I! [logagent] starting
2023-05-30T06:11:03Z I! [logagent] found plugin cloudwatchlogs is a log backend
2023-05-30T06:15:03Z I! [processors.ec2tagger] ec2tagger: EC2 tagger has started initialization.
2023-05-30T06:15:03Z I! [processors.ec2tagger] ec2tagger: Check ec2 metadata
2023-05-30T06:19:03Z I! [processors.ec2tagger] ec2tagger: EC2 tagger has started initialization.
I0530 06:19:03.756196       1 leaderelection.go:248] attempting to acquire leader lease amazon-cloudwatch/cwagent-clusterleader...
2023-05-30T06:19:03Z I! k8sapiserver Switch New Leader: ip-10-201-65-9.eu-west-1.compute.internal
W0530 06:19:03.815922       1 manager.go:291] Could not configure a source for OOM detection, disabling OOM events: open /dev/kmsg: no such file or directory
2023-05-30T06:19:04Z I! [processors.ec2tagger] ec2tagger: Initial retrieval of tags succeded
2023-05-30T06:19:04Z I! [processors.ec2tagger] ec2tagger: EC2 tagger has started, finished initial retrieval of tags and Volumes
2023-05-30T06:19:07Z W! ReplicaSet initial sync timeout: timed out waiting for the condition
2023-05-30T06:22:14Z I! [processors.ec2tagger] ec2tagger: Refresh is no longer needed, stop refreshTicker.
2023-05-30T06:23:04Z I! [processors.ec2tagger] ec2tagger: Initial retrieval of tags succeded
2023-05-30T06:23:04Z I! [processors.ec2tagger] ec2tagger: EC2 tagger has started, finished initial retrieval of tags and Volumes
2023-05-30T06:26:14Z I! [processors.ec2tagger] ec2tagger: Refresh is no longer needed, stop refreshTicker.
2023-05-30T06:30:03Z I! no pod is found for namespace:a-namespace,podName:a-indexerror-28090470-52zzv, refresh the cache now...
2023-05-30T07:00:03Z I! no pod is found for namespace:a-namespace,podName:a-indexerror-28090500-xwnlx, refresh the cache now...
2023-05-30T07:30:03Z I! no pod is found for namespace:b-namespace,podName:jobreindex-check-28090530-mntg4, refresh the cache now...
2023-05-30T07:52:03Z I! no pod is found for namespace:amazon-cloudwatch,podName:fluentd-cloudwatch-rpkln, refresh the cache now...
2023-05-30T08:00:03Z I! no pod is found for namespace:b-namespace,podName:jobs3ver-indexerror-28090560-c2k47, refresh the cache now...
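The repeated "no pod is found ... refresh the cache now" lines above look like k8sdecorator cache misses for short-lived Job pods, so they probably aren't the problem. As a second end-to-end check, I was thinking of querying the metric namespace that Container Insights is supposed to populate; the namespace and dimension names here are my assumptions from the standard setup:

```shell
# Hedged check: Container Insights extracts metrics from the performance
# log events (embedded metric format) into the "ContainerInsights"
# namespace, so recent metrics there would confirm end-to-end delivery.
METRIC_NAMESPACE="ContainerInsights"
echo "${METRIC_NAMESPACE}"

# List metrics published for this cluster (requires AWS credentials):
# aws cloudwatch list-metrics \
#     --namespace "${METRIC_NAMESPACE}" \
#     --dimensions Name=ClusterName,Value=One-Of-Many-EKS-Cluster \
#     --region eu-west-1
```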
Stefano Lazzaro
