I have installed the GCP Ops Agent on some production machines to collect their metrics and create alerts. The instances have no service account attached, and that cannot be changed because they are in production (and some of them have local SSDs). Since I cannot set the service account directly on the instance, I decided to configure the credentials manually, and I have almost got it working.
To do this, I created a service account and a JSON key for that account. I installed the key using this gcloud command:
gcloud auth application-default login --key-file=key-file.json
and I copied the JSON to root's config folder so that it is used as the default credentials for Google applications:
cp key-file.json /root/.config/gcloud/application_default_credentials.json
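A quick way to double-check which kind of credential ended up in that file is to look at its "type" field. A service account key normally carries "service_account" there, whereas credentials written by an interactive user login carry "authorized_user". The following is only a minimal sketch: the function names are made up for illustration, and the text match is a crude substitute for a real JSON parser that assumes the usual gcloud file layout.

```shell
# Print the value of the "type" field of a credentials JSON file.
# Crude text match, not a JSON parser; assumes a simple top-level
# "type": "..." entry as found in gcloud-generated files.
credential_type() {
    sed -n 's/.*"type"[[:space:]]*:[[:space:]]*"\([^"]*\)".*/\1/p' "$1" | head -n 1
}

# Succeed only if the file looks like a service account key.
is_service_account_key() {
    [ "$(credential_type "$1")" = "service_account" ]
}
```

For example, `is_service_account_key /root/.config/gcloud/application_default_credentials.json && echo "service account key"` prints the message only when the copied file is a service account key.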
After running those commands, I can obtain an application access token with the following command:
gcloud auth application-default print-access-token
The metrics collector now seems to be working, because I have data in GCP. My problem is that Fluent Bit still has authentication problems, because it is unable to get the OAuth token:
[2021/12/09 14:01:11] [error] [output:stackdriver:stackdriver.1] can't fetch token from the metadata server
[2021/12/09 14:01:11] [error] [output:stackdriver:stackdriver.1] cannot retrieve oauth2 token
So I searched a bit more and found that the GOOGLE_SERVICE_CREDENTIALS variable should work. When I edit the systemd unit to add the environment variable with the following override:
[Service]
Environment='GOOGLE_SERVICE_CREDENTIALS=/root/.config/gcloud/application_default_credentials.json'
Fluent Bit seems to be able to get the access token now:
[2021/12/09 14:07:56] [ info] [output:stackdriver:stackdriver.0] metadata_server set to http://metadata.google.internal
[2021/12/09 14:07:56] [ info] [oauth2] HTTP Status=200
[2021/12/09 14:07:56] [ info] [oauth2] access token from 'www.googleapis.com:443' retrieved
[2021/12/09 14:07:56] [ info] [output:stackdriver:stackdriver.0] worker #0 started
[2021/12/09 14:07:56] [ info] [output:stackdriver:stackdriver.0] worker #1 started
[2021/12/09 14:07:56] [ info] [output:stackdriver:stackdriver.0] worker #2 started
[2021/12/09 14:07:56] [ info] [output:stackdriver:stackdriver.0] worker #3 started
[2021/12/09 14:07:56] [ info] [output:stackdriver:stackdriver.0] worker #4 started
[2021/12/09 14:07:56] [ info] [output:stackdriver:stackdriver.0] worker #5 started
[2021/12/09 14:07:56] [ info] [output:stackdriver:stackdriver.0] worker #6 started
[2021/12/09 14:07:56] [ info] [output:stackdriver:stackdriver.0] worker #7 started
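For reference, the override used above can also be written as a drop-in file instead of being typed into an editor. This is a minimal sketch under some assumptions: the function name is hypothetical, the drop-in directory is passed in by the caller, and the file name override.conf follows what `systemctl edit` normally creates.

```shell
# Write a systemd drop-in that sets GOOGLE_SERVICE_CREDENTIALS.
# $1: drop-in directory (e.g. /etc/systemd/system/<unit>.service.d)
# $2: absolute path to the service account key file
write_credentials_dropin() {
    dropin_dir=$1
    key_path=$2
    mkdir -p "$dropin_dir" || return 1
    # Same two lines as the override shown earlier, with the key path filled in.
    printf "[Service]\nEnvironment='GOOGLE_SERVICE_CREDENTIALS=%s'\n" "$key_path" \
        > "$dropin_dir/override.conf"
}
```

With the unit name that appears in the logs, the directory would be /etc/systemd/system/google-cloud-ops-agent-fluent-bit.service.d. After writing the file as root, `systemctl daemon-reload` followed by a restart of the unit picks up the change, and `systemctl show google-cloud-ops-agent-fluent-bit.service --property=Environment` shows whether the variable is actually set.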
But then it starts crashing:
Dec 9 14:05:48 tradeinn-web-pro-mariadb-11 systemd[1]: google-cloud-ops-agent-fluent-bit.service: Main process exited, code=killed, status=6/ABRT
Dec 9 14:05:48 tradeinn-web-pro-mariadb-11 systemd[1]: google-cloud-ops-agent-fluent-bit.service: Failed with result 'signal'.
Dec 9 14:05:48 tradeinn-web-pro-mariadb-11 systemd[1]: google-cloud-ops-agent-fluent-bit.service: Service RestartSec=100ms expired, scheduling restart.
Dec 9 14:05:48 tradeinn-web-pro-mariadb-11 systemd[1]: google-cloud-ops-agent-fluent-bit.service: Scheduled restart job, restart counter is at 4.
Dec 9 14:05:48 tradeinn-web-pro-mariadb-11 systemd[1]: Stopped Google Cloud Ops Agent - Logging Agent.
Dec 9 14:05:48 tradeinn-web-pro-mariadb-11 systemd[1]: Starting Google Cloud Ops Agent - Logging Agent...
Dec 9 14:05:48 tradeinn-web-pro-mariadb-11 systemd[1]: Started Google Cloud Ops Agent - Logging Agent.
Dec 9 14:05:48 tradeinn-web-pro-mariadb-11 fluent-bit[27095]: #033[1mFluent Bit v1.8.4#033[0m
Dec 9 14:05:48 tradeinn-web-pro-mariadb-11 fluent-bit[27095]: * #033[1m#033[93mCopyright (C) 2019-2021 The Fluent Bit Authors#033[0m
Dec 9 14:05:48 tradeinn-web-pro-mariadb-11 fluent-bit[27095]: * #033[1m#033[93mCopyright (C) 2015-2018 Treasure Data#033[0m
Dec 9 14:05:48 tradeinn-web-pro-mariadb-11 fluent-bit[27095]: * Fluent Bit is a CNCF sub-project under the umbrella of Fluentd
Dec 9 14:05:48 tradeinn-web-pro-mariadb-11 fluent-bit[27095]: * https://fluentbit.io
Dec 9 14:05:49 tradeinn-web-pro-mariadb-11 fluent-bit[27095]: [2021/12/09 14:05:49] [engine] caught signal (SIGSEGV)
Dec 9 14:05:49 tradeinn-web-pro-mariadb-11 fluent-bit[27095]: [2021/12/09 14:05:49] [engine] caught signal (SIGSEGV)
Dec 9 14:05:49 tradeinn-web-pro-mariadb-11 fluent-bit[27095]: #0 0x5636b208e714 in ???() at ???:0
Dec 9 14:05:49 tradeinn-web-pro-mariadb-11 fluent-bit[27095]: #1 0x5636b200fc87 in ???() at ???:0
Dec 9 14:05:49 tradeinn-web-pro-mariadb-11 fluent-bit[27095]: #2 0x5636b227601f in ???() at ???:0
Dec 9 14:05:49 tradeinn-web-pro-mariadb-11 fluent-bit[27095]: #3 0x5636b208e714 in ???() at ???:0
Dec 9 14:05:49 tradeinn-web-pro-mariadb-11 fluent-bit[27095]: #4 0x5636b200fc87 in ???() at ???:0
Dec 9 14:05:49 tradeinn-web-pro-mariadb-11 fluent-bit[27095]: #5 0x5636b227601f in ???() at ???:0
Is there any way to make it work with a service account JSON key?