We just found that some logs are missing from Stackdriver.
We can list the log messages with kubectl logs,
but some of them are not sent to Stackdriver for some reason.
An example of a log entry that is missing:
{"severity":"info","time":"2021-06-07T08:19:17.598Z","caller":"zap/options.go:212","msg":"finished unary call with code OK","grpc.start_time":"2021-06-07T08:19:17Z","system":"grpc","span.kind":"server","grpc.service":"manabie.tom.ChatService","grpc.method":"SendMessage","peer.address":"127.0.0.1:32806","userID":"xxxx","x-request-id":"xxxx","grpc.code":"OK","grpc.time_ms":48.04899978637695}
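One way to confirm that an entry like this is really absent from Stackdriver, rather than just hard to find in the console, is to query a narrow time window around it from the command line. The following is only a sketch: `build_filter` is a hypothetical helper, `my-container` is a placeholder for the real container name, and the filter assumes the logs are stored under the `k8s_container` resource type.

```shell
#!/bin/sh
# Hypothetical helper: print a Cloud Logging filter for one container
# and one time window.
# $1 = container name, $2 = window start, $3 = window end (RFC 3339).
# Assumption: entries are written under resource.type="k8s_container".
build_filter() {
  printf 'resource.type="k8s_container" AND resource.labels.container_name="%s" AND timestamp>="%s" AND timestamp<="%s"\n' "$1" "$2" "$3"
}

# Example call (needs gcloud and access to the project, so it is left commented out):
# gcloud logging read "$(build_filter my-container 2021-06-07T08:19:17Z 2021-06-07T08:19:18Z)" --limit=50 --format=json
```

If the query returns other entries from the same second but not this one, the entry was dropped somewhere between the node and the Logging API.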
Checking the logs of the fluentbit daemon:
kubectl logs fluentbit-gke-xxxx -c fluentbit-gke -f --tail=1
I see some warnings like:
W0607 08:16:55.066861 1 server.go:77] Received empty or invalid msgpack for tag kube_xxxxxxxx
W0607 08:16:59.072151 1 server.go:77] Received empty or invalid msgpack for tag kube_xxxxxxxx
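To see whether these warnings are concentrated on a few tags, the exporter output can be summarized per tag. This is a minimal sketch; `count_msgpack_warnings` is a hypothetical helper name, and it assumes the tag is always the last field of the warning line, as in the two lines above.

```shell
#!/bin/sh
# Hypothetical helper: read exporter log lines on stdin and print
# "<count> <tag>" for every tag that produced the msgpack warning,
# most frequent first.
# Assumption: the tag is the last whitespace-separated field of the line.
count_msgpack_warnings() {
  awk '/Received empty or invalid msgpack for tag/ { n[$NF]++ }
       END { for (t in n) print n[t], t }' | sort -rn
}

# Example call (needs cluster access, so it is left commented out):
# kubectl logs fluentbit-gke-xxxx -c fluentbit-gke | count_msgpack_warnings
```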
Describing the daemon set:
kubectl describe daemonset fluentbit-gke
Name: fluentbit-gke
Selector: component=fluentbit-gke,k8s-app=fluentbit-gke
Node-Selector: kubernetes.io/os=linux
Labels: addonmanager.kubernetes.io/mode=Reconcile
        k8s-app=fluentbit-gke
        kubernetes.io/cluster-service=true
Annotations: deprecated.daemonset.template.generation: 9
Desired Number of Nodes Scheduled: 4
Current Number of Nodes Scheduled: 4
Number of Nodes Scheduled with Up-to-date Pods: 4
Number of Nodes Scheduled with Available Pods: 4
Number of Nodes Misscheduled: 0
Pods Status: 4 Running / 0 Waiting / 0 Succeeded / 0 Failed
Pod Template:
  Labels: component=fluentbit-gke
          k8s-app=fluentbit-gke
          kubernetes.io/cluster-service=true
  Annotations: EnableNodeJournal: false
               EnablePodSecurityPolicy: false
               SystemOnlyLogging: false
               components.gke.io/component-name: fluentbit
               components.gke.io/component-version: 1.4.4
               monitoring.gke.io/path: /api/v1/metrics/prometheus
  Service Account: fluentbit-gke
  Containers:
   fluentbit:
    Image: gke.gcr.io/fluent-bit:v1.5.7-gke.1
    Port: 2020/TCP
    Host Port: 2020/TCP
    Limits:
      memory: 250Mi
    Requests:
      cpu: 50m
      memory: 100Mi
    Liveness: http-get http://:2020/ delay=120s timeout=1s period=60s #success=1 #failure=3
    Environment: <none>
    Mounts:
      /fluent-bit/etc/ from config-volume (rw)
      /var/lib/docker/containers from varlibdockercontainers (ro)
      /var/lib/kubelet/pods from varlibkubeletpods (rw)
      /var/log from varlog (rw)
      /var/run/google-fluentbit/pos-files from varrun (rw)
   fluentbit-gke:
    Image: gke.gcr.io/fluent-bit-gke-exporter:v0.16.2-gke.0
    Port: 2021/TCP
    Host Port: 2021/TCP
    Command:
      /fluent-bit-gke-exporter
      --kubernetes-separator=_
      --stackdriver-resource-model=k8s
      --enable-pod-label-discovery
      --pod-label-dot-replacement=_
      --split-stdout-stderr
      --logtostderr
    Limits:
      memory: 250Mi
    Requests:
      cpu: 50m
      memory: 100Mi
    Liveness: http-get http://:2021/healthz delay=120s timeout=1s period=60s #success=1 #failure=3
    Environment: <none>
    Mounts: <none>
  Volumes:
   varrun:
    Type: HostPath (bare host directory volume)
    Path: /var/run/google-fluentbit/pos-files
    HostPathType:
   varlog:
    Type: HostPath (bare host directory volume)
    Path: /var/log
    HostPathType:
   varlibkubeletpods:
    Type: HostPath (bare host directory volume)
    Path: /var/lib/kubelet/pods
    HostPathType:
   varlibdockercontainers:
    Type: HostPath (bare host directory volume)
    Path: /var/lib/docker/containers
    HostPathType:
   config-volume:
    Type: ConfigMap (a volume populated by a ConfigMap)
    Name: fluentbit-gke-config-v1.0.6
    Optional: false
  Priority Class Name: system-node-critical
Events: <none>
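Since the daemon set runs one pod per node (4 here), the warnings are only relevant if they come from the fluentbit-gke pod on the same node as the application pod whose logs are missing. A sketch for finding that pod follows. `fluentbit_pod_for` and `my-app-pod` are hypothetical names, and the `kube-system` namespace is an assumption (it is where GKE normally deploys this daemon set, but the output above does not show it).

```shell
#!/bin/sh
# Hypothetical helper: print the fluentbit-gke pod that runs on the same
# node as a given application pod.
# $1 = application pod name, $2 = its namespace (defaults to "default").
# Assumption: the daemon set lives in kube-system unless FLUENTBIT_NS is set.
FLUENTBIT_NS="${FLUENTBIT_NS:-kube-system}"

fluentbit_pod_for() {
  # Look up the node the application pod is scheduled on.
  node=$(kubectl get pod "$1" -n "${2:-default}" -o jsonpath='{.spec.nodeName}') || return 1
  # List the fluentbit-gke pod on that node, using the label from the selector above.
  kubectl get pods -n "$FLUENTBIT_NS" -l k8s-app=fluentbit-gke \
    --field-selector "spec.nodeName=$node" -o name
}

# Example call (needs cluster access, so it is left commented out):
# kubectl logs "$(fluentbit_pod_for my-app-pod)" -n "$FLUENTBIT_NS" -c fluentbit-gke --tail=20
```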