4

I am setting up a pipeline to send Kubernetes pod logs to an Elasticsearch cluster. I have installed Filebeat as a DaemonSet (stream: stdout) in my cluster and connected its output to Logstash. Filebeat connects to Logstash without any issue. Now I want logs only from the application namespaces, not from all namespaces in the cluster. Can someone guide me on how to filter this in Filebeat, and also on how to see the source message from the JSON in Elasticsearch?

This is my config:

data:
  kubernetes.yml: |-
    - type: docker
      containers:
        path: "/var/lib/docker/containers"
        stream: "stdout"
        ids: "*"
        multiline.pattern: '^\s'
        multiline.match: after
      fields:
         logtype: container
      multiline:
         pattern: '^[0-9]{4}-[0-9]{2}-[0-9]{2}'
         negate: true
         match: after
      ignore_older: 1h
      processors:
        - add_kubernetes_metadata:
            in_cluster: true
        - decode_json_fields:
            fields: ["log"]
            overwrite_keys: true
            target: ""

Output in kibana:


{
  "_index": "filebeat-6.8.4-2020.03.06",
  "_type": "doc",
  "_id": "vHkzsHABJ57Tsdxxxxx",
  "_version": 1,
  "_score": null,
  "_source": {
    "log": {
      "file": {
        "path": "/var/lib/docker/containers/aa54562be9448183d69d8d2e1953e74560309176f044aed23484ac9e3260982c/sdnksdsdlsdnfsdlfslfnsdslfnsnlnflksdnflkdsfnsdflsdfndslffndslf-json.log"
      }
    },
    "tags": [
      "beats_input_codec_plain_applied",
      "_grokparsefailure"
    ],
    "input": {
      "type": "docker"
    },
    "@version": "1",
    "prospector": {
      "type": "docker"
    },
    "beat": {
      "version": "6.8.4",
      "name": "filebeat-vtp2f",
      "hostname": "filebeat-vtp2f"
    },
    "host": {
      "name": "filebeat-vtp2f"
    },
    "offset": 5798785,
    "stream": "stdout",
    "fields": {
      "logtype": "container"
    },
    "kubernetes": {
      "node": {
        "name": "k8-test-22313607-0"
      },
      "labels": {
        "version": "v1",
        "kubernetes": {
          "io/cluster-service": "true"
        },
        "controller-revision-hash": "6b56cfcb69",
        "pod-template-generation": "1",
        "k8s-app": "fluent"
      },
      "container": {
        "name": "fluentd"
      },
      "pod": {
        "uid": "72c50b54-5ef0-11ea-83e1-26018882335d",
        "name": "fluent-4lft2"
      },
      "namespace": "fluentd"
    },
    "source": "/var/lib/docker/containers/aa54562be9448183d69d8d2e1953e74560309176f044aed23484ac9e3260982c/aa54562be9448183d69d8d2e1953e74560309176f044aed23484ac9e3260982c-json.log",
    "@timestamp": "2020-03-06T14:15:18.561Z"
  },
  "fields": {
    "@timestamp": [
      "2020-03-06T14:15:18.561Z"
    ]
  },
  "highlight": {
    "prospector.type": [
      "@kibana-highlighted-field@docker@/kibana-highlighted-field@"
    ]
  },
  "sort": [
    1583504118561
  ]
}
Pang
paulpuvi

3 Answers

5

If you want Filebeat to grab logs only from certain namespaces, you can use a condition in an autodiscover template:

filebeat.yml:

    logging.level: error
    logging.json: true
    filebeat.config:
      inputs:
        # Mounted `filebeat-inputs` configmap:
        path: ${path.config}/inputs.d/*.yml
        # Reload inputs configs as they change:
        reload.enabled: false
      modules:
        path: ${path.config}/modules.d/*.yml
        # Reload module configs as they change:
        reload.enabled: false
    filebeat.autodiscover:
      providers:
        - type: kubernetes
          templates:
          - condition:
              equals:
                kubernetes.namespace: stage
            config:
              - type: container
                paths:
                 - /var/log/containers/*${data.kubernetes.container.id}.log
                multiline.pattern: '^[[:space:]]'
                multiline.negate: false
                multiline.match: after
                include_lines: ['^{']

Note this part:

          templates:
          - condition:
              equals:
                kubernetes.namespace: stage

I run Filebeat as a DaemonSet in each namespace. It's a bit of extra overhead, but Filebeat can be finicky, so this helps us work out issues in other logical environments first.
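
As far as I know, the `container` input used above is only available from Filebeat 7.2 onward, while the question shows 6.8.4. Here is a rough sketch of the same idea for 6.x, assuming the `docker` input and two application namespaces (`stage` comes from the config above, `prod` is a hypothetical second namespace). The `decode_json_fields` part additionally assumes that each log line is itself a JSON object and that the input places the raw line in the `message` field:

```yaml
filebeat.autodiscover:
  providers:
    - type: kubernetes
      templates:
        # Only start an input for containers in the listed namespaces.
        - condition:
            or:
              - equals:
                  kubernetes.namespace: stage
              - equals:
                  kubernetes.namespace: prod   # hypothetical namespace
          config:
            - type: docker
              containers.ids:
                - "${data.kubernetes.container.id}"
              processors:
                # Assumption: the application writes one JSON object per line.
                - decode_json_fields:
                    fields: ["message"]
                    target: ""
                    overwrite_keys: true
```

With `target: ""` the decoded keys are merged into the root of the event, so they should show up as top-level fields in Kibana.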

Pang
matt
  • How about running Filebeat per application? Most docs online expect one Filebeat for the whole cluster, but what if I want Filebeat to ship logs per app deployed inside the cluster? – uberrebu May 19 '21 at 08:39
  • condition where I'm matching on `kubernetes.namespace`. There are other types you can match on: https://www.elastic.co/guide/en/beats/filebeat/current/configuration-autodiscover.html#_kubernetes – matt May 20 '21 at 16:51
5

I documented how to drop events from certain namespaces here: https://ezyforanykey.blogspot.com/2020/11/filebeat-exclude-kubernetes-namespace.html

An example is below:

- type: container
  paths:
    - /var/log/containers/*.log
  exclude_files:
    - /var/log/containers/java.*
  processors:
    - add_kubernetes_metadata:
        host: ${NODE_NAME}
        matchers:
          - logs_path:
              logs_path: "/var/log/containers/"
    - drop_event.when:
        or:
          - equals:
              kubernetes.namespace: "kube-system"
          - equals:
              kubernetes.namespace: "calico-system"
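
To keep only the application namespaces instead of listing every namespace to drop, the condition can be inverted with `not`. This is a minimal sketch, assuming the application namespaces are called `app-a` and `app-b` (hypothetical names) and that the processor runs after `add_kubernetes_metadata`, so that `kubernetes.namespace` is already set:

```yaml
processors:
  - add_kubernetes_metadata:
      host: ${NODE_NAME}
      matchers:
        - logs_path:
            logs_path: "/var/log/containers/"
  # Drop every event whose namespace is NOT one of the listed ones.
  - drop_event:
      when:
        not:
          or:
            - equals:
                kubernetes.namespace: "app-a"   # hypothetical namespace
            - equals:
                kubernetes.namespace: "app-b"   # hypothetical namespace
```

Note that events with no `kubernetes.namespace` field at all would also be dropped by this condition.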
user_1771
  • How about running Filebeat per application? Most docs online expect one Filebeat for the whole cluster, but what if I want Filebeat to ship logs per app deployed inside the cluster? – uberrebu May 19 '21 at 08:38
0

I don't know how to filter in Filebeat (or even whether it's possible), but you can filter on fields in the output section of your Logstash configuration, using conditionals:

output {
    if [kubernetes][namespace] == "fluentd" {
        # ...
        # Send to Elasticsearch
        # ...
    } else {
        # ...
    }
}

This way you can choose different actions to take on each message, depending on the value of the kubernetes.namespace field.

baudsp
  • Thank you, this helped me achieve the result, but in a different way: I used the condition in the filter section to get the desired output. Is there a way to use a "contains" operator in the if logic? For example, if kubernetes.node.name contains 'test', it should create an index with a test suffix. – paulpuvi Mar 10 '20 at 08:13