I'm using the fluent-operator to deploy fluentbit and fluentd. Fluentbit collects the logs, enriches them with Kubernetes metadata, and forwards them to Fluentd, which ships them to AWS OpenSearch.
I was trying to include a namespace label (org) in the logstash_prefix like this:
apiVersion: fluentd.fluent.io/v1alpha1
kind: ClusterOutput
metadata:
  name: cluster-fluentd-output-os
  labels:
    output.fluentd.fluent.io/scope: "cluster"
    output.fluentd.fluent.io/enabled: "true"
spec:
  outputs:
    - customPlugin:
        config: |
          <match **>
            @type opensearch
            host "${FLUENT_OPENSEARCH_HOST}"
            port 443
            logstash_format true
            logstash_prefix logs-$${record['kubernetes']['namespace_name']['labels']['org']}
            scheme https
            <endpoint>
              url "https://${FLUENT_OPENSEARCH_HOST}"
              region "${FLUENT_OPENSEARCH_REGION}"
              assume_role_arn "#{ENV['AWS_ROLE_ARN']}"
              assume_role_web_identity_token_file "#{ENV['AWS_WEB_IDENTITY_TOKEN_FILE']}"
            </endpoint>
          </match>
      logLevel: info
But it ended up creating an index with the unresolved placeholder embedded literally in its name:
logs-${record['kubernetes']['namespace_name']['labels']['org']}-2023.04.04
Part of the problem is that fluentbit doesn't enrich the logs with any namespace metadata beyond the namespace name, so there are no namespace labels in the record for the placeholder to resolve against (the lookup path above also treats namespace_name as a hash, but in the record it's just a string). Here's an example log:
{
  "_index": "logs-${record['kubernetes']['pod_name']}-2023.04.04",
  "_id": "XXX-XXX",
  "_version": 1,
  "_score": null,
  "_source": {
    "log": "2023-04-04T22:19:06.572970297Z stdout F Notice: XXXX for 'XXXX'",
    "kubernetes": {
      "pod_name": "dockerhub-limit-exporter-XXX-XXX",
      "namespace_name": "observability-system",
      "labels": {
        "app.kubernetes.io/instance": "dockerhub-limit-exporter",
        "app.kubernetes.io/name": "dockerhub-limit-exporter",
        "pod-template-hash": "XXX"
      },
      "annotations": {
        "kubernetes.io/psp": "eks.privileged"
      },
      "container_name": "dockerhub-limit-exporter",
      "docker_id": "XXXX",
      "container_image": "XXX.XXX.XXX.com/XXX/docker-hub-limit-exporter:0.0.0"
    },
    "@timestamp": "2023-04-04T22:19:06.576573487+00:00"
  },
  "fields": {
    "@timestamp": [
      "2023-04-04T22:19:06.576Z"
    ]
  },
  "highlight": {
    "kubernetes.namespace_name": [
      "@opensearch-dashboards-highlighted-field@observability@/opensearch-dashboards-highlighted-field@-@opensearch-dashboards-highlighted-field@system@/opensearch-dashboards-highlighted-field@"
    ]
  },
  "sort": [
    1680646746576
  ]
}
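For reference, my understanding of fluent-plugin-opensearch (which inherits its placeholder mechanism from fluent-plugin-elasticsearch) is that a ${...} reference in logstash_prefix is only substituted when the referenced value is declared as a buffer chunk key. So even a field that does exist in the record, like the namespace name, would need something like the following. This is a sketch of what I believe the syntax should be, not a working config from my cluster (and in my manifests I would presumably still need the $$ escaping for variable substitution):

```
<match **>
  @type opensearch
  host "${FLUENT_OPENSEARCH_HOST}"
  port 443
  logstash_format true
  # ${$.kubernetes.namespace_name} should resolve because the same
  # record accessor is declared as a chunk key in <buffer> below
  logstash_prefix logs-${$.kubernetes.namespace_name}
  scheme https
  <buffer tag, $.kubernetes.namespace_name>
    @type memory
  </buffer>
</match>
```

Even if that works, it only gets me the namespace name; it still wouldn't give me the namespace's org label, since fluentbit doesn't add namespace labels to the record.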
This is my actual fluentbit and fluentd configuration:
apiVersion: fluentbit.fluent.io/v1alpha2
kind: FluentBit
metadata:
  labels:
    app.kubernetes.io/name: fluent-bit
  name: fluent-bit
  namespace: fluent-system
spec:
  affinity:
    nodeAffinity:
      requiredDuringSchedulingIgnoredDuringExecution:
        nodeSelectorTerms:
          - matchExpressions:
              - key: node-role.kubernetes.io/edge
                operator: DoesNotExist
  fluentBitConfigName: fluent-bit-config
  image: kubesphere/fluent-bit:${FLUENTBIT_IMAGE_TAG:=v2.0.9}
  imagePullSecrets:
    - name: image-pull-secret
  positionDB:
    hostPath:
      path: /var/lib/fluent-bit/
  resources:
    limits:
      cpu: ${FLUENTBIT_CPU_LIMIT:=500m}
      memory: ${FLUENTBIT_MEMORY_LIMIT:=200Mi}
    requests:
      cpu: ${FLUENTBIT_CPU_REQUEST:=10m}
      memory: ${FLUENTBIT_MEMORY_REQUEST:=25Mi}
  tolerations:
    - operator: Exists
---
apiVersion: fluentbit.fluent.io/v1alpha2
kind: ClusterInput
metadata:
  labels:
    fluentbit.fluent.io/component: logging
    fluentbit.fluent.io/enabled: "true"
  name: tail
spec:
  tail:
    db: /fluent-bit/tail/pos.db
    dbSync: Normal
    memBufLimit: ${FLUENTBIT_MEM_BUF_LIMIT:=5MB}
    parser: docker
    path: /var/log/containers/*.log
    refreshIntervalSeconds: 10
    skipLongLines: true
    tag: kube.*
---
apiVersion: fluentbit.fluent.io/v1alpha2
kind: ClusterInput
metadata:
  labels:
    fluentbit.fluent.io/component: logging
    fluentbit.fluent.io/enabled: "true"
  name: docker
spec:
  systemd:
    db: /fluent-bit/tail/systemd.db
    dbSync: Normal
    path: /var/log/journal
    systemdFilter:
      - _SYSTEMD_UNIT=docker.service
      - _SYSTEMD_UNIT=kubelet.service
    tag: service.*
---
apiVersion: fluentbit.fluent.io/v1alpha2
kind: ClusterFluentBitConfig
metadata:
  labels:
    app.kubernetes.io/name: fluent-bit
  name: fluent-bit-config
spec:
  filterSelector:
    matchLabels:
      fluentbit.fluent.io/enabled: "true"
  inputSelector:
    matchLabels:
      fluentbit.fluent.io/enabled: "true"
  outputSelector:
    matchLabels:
      fluentbit.fluent.io/enabled: "true"
  service:
    parsersFile: parsers.conf
---
apiVersion: fluentbit.fluent.io/v1alpha2
kind: ClusterFilter
metadata:
  labels:
    fluentbit.fluent.io/component: logging
    fluentbit.fluent.io/enabled: "true"
  name: kubernetes
spec:
  filters:
    - kubernetes:
        annotations: true
        kubeCAFile: /var/run/secrets/kubernetes.io/serviceaccount/ca.crt
        kubeTokenFile: /var/run/secrets/kubernetes.io/serviceaccount/token
        kubeURL: https://kubernetes.default.svc:443
        labels: true
    - nest:
        addPrefix: kubernetes_
        nestedUnder: kubernetes
        operation: lift
    - modify:
        rules:
          - remove: stream
          - remove: kubernetes_pod_id
          - remove: kubernetes_host
          - remove: kubernetes_container_hash
    - nest:
        nestUnder: kubernetes
        operation: nest
        removePrefix: kubernetes_
        wildcard:
          - kubernetes_*
  match: kube.*
---
apiVersion: fluentbit.fluent.io/v1alpha2
kind: ClusterOutput
metadata:
  labels:
    fluentbit.fluent.io/component: logging
    fluentbit.fluent.io/enabled: "true"
  name: fluentd
spec:
  forward:
    host: fluentd.fluent-system.svc
    port: 24224
  matchRegex: (?:kube|service)\.(.*)
---
apiVersion: fluentd.fluent.io/v1alpha1
kind: Fluentd
metadata:
  name: fluentd
  namespace: fluent-system
  labels:
    app.kubernetes.io/name: fluentd
spec:
  globalInputs:
    - forward:
        bind: 0.0.0.0
        port: 24224
  replicas: 1
  image: kubesphere/fluentd:${FLUENTD_IMAGE_TAG:=v1.15.3}
  imagePullSecrets:
    - name: image-pull-secret
  resources:
    limits:
      cpu: ${FLUENTD_CPU_LIMIT:=500m}
      memory: ${FLUENTD_MEMORY_LIMIT:=500Mi}
    requests:
      cpu: ${FLUENTD_CPU_REQUEST:=100m}
      memory: ${FLUENTD_MEMORY_REQUEST:=128Mi}
  fluentdCfgSelector:
    matchLabels:
      config.fluentd.fluent.io/enabled: "true"
---
apiVersion: fluentd.fluent.io/v1alpha1
kind: ClusterFluentdConfig
metadata:
  labels:
    config.fluentd.fluent.io/enabled: "true"
  name: fluentd-config
spec:
  clusterFilterSelector:
    matchLabels:
      filter.fluentd.fluent.io/enabled: "true"
  clusterOutputSelector:
    matchLabels:
      output.fluentd.fluent.io/enabled: "true"
  watchedNamespaces: [] # watches all namespaces when empty
---
apiVersion: fluentd.fluent.io/v1alpha1
kind: ClusterOutput
metadata:
  name: cluster-fluentd-output-os
  labels:
    output.fluentd.fluent.io/scope: "cluster"
    output.fluentd.fluent.io/enabled: "true"
spec:
  outputs:
    - customPlugin:
        config: |
          <match **>
            @type opensearch
            host "${FLUENT_OPENSEARCH_HOST}"
            port 443
            logstash_format true
            logstash_prefix logs-$${record['kubernetes']['namespace_name']['labels']['org']}
            scheme https
            <endpoint>
              url "https://${FLUENT_OPENSEARCH_HOST}"
              region "${FLUENT_OPENSEARCH_REGION}"
              assume_role_arn "#{ENV['AWS_ROLE_ARN']}"
              assume_role_web_identity_token_file "#{ENV['AWS_WEB_IDENTITY_TOKEN_FILE']}"
            </endpoint>
          </match>
      logLevel: info
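One direction I've considered is lifting the value I want into a top-level record key with a Fluentd ClusterFilter, so it could then serve as a buffer chunk key in the output. The sketch below is untested and makes assumptions: that ClusterFilter supports customPlugin the way ClusterOutput does, and that the org value is available as a pod label (it isn't in my case, since it's a namespace label, which is exactly the metadata that's missing):

```
apiVersion: fluentd.fluent.io/v1alpha1
kind: ClusterFilter
metadata:
  name: extract-org
  labels:
    filter.fluentd.fluent.io/enabled: "true"
spec:
  filters:
    - customPlugin:
        config: |
          <filter **>
            @type record_transformer
            enable_ruby true
            <record>
              # copy a pod label to a top-level "org" key (hypothetical;
              # falls back to "unknown" when the label is absent)
              org $${record.dig('kubernetes', 'labels', 'org') || 'unknown'}
            </record>
          </filter>
```

But this still feels like working around the operator rather than with it, hence my question: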
How can dynamic index names based on Kubernetes metadata be achieved using the fluent-operator's features?