Network Load Balancing in GCP

System used: a three-server setup: Nginx <--> PHP-FPM (using instance groups) <--> NFS, and Nginx <--> NFS.

I use health checks, but sometimes the health check restarts apps that are still running, and traffic does not pass through the network properly (that is, PHP-FPM sometimes does not respond smoothly to Nginx). This has been happening since 23 Dec 2020; before that, everything ran very smoothly.

PS: I use the Jakarta data center for GCP. This is the error I see in the serial output of one server in the instance group:

Jan 26 10:17:50 php-backend-8s46 collectd[1532]: write_gcm: curl_easy_perform() failed: Timeout was reached
Jan 26 10:18:33 php-backend-8s46 collectd[1532]: write_gcm: Error talking to the endpoint.
Jan 26 10:18:45 php-backend-8s46 collectd[1532]: write_gcm: wg_transmit_unique_segment failed.
Jan 26 10:18:55 php-backend-8s46 collectd[1532]: write_gcm: wg_transmit_unique_segments failed. Flushing.
Jan 26 10:19:00 php-backend-8s46 collectd[1532]: write_gcm: can not take infinite value
Jan 26 10:20:07 php-backend-8s46 collectd[1532]: write_gcm: wg_typed_value_create_from_value_t_inline failed for swap/percent/value! Continuing.
Jan 26 10:21:08 php-backend-8s46 collectd[1532]: write_gcm: can not take infinite value
Jan 26 10:21:55 php-backend-8s46 collectd[1532]: write_gcm: wg_typed_value_create_from_value_t_inline failed for swap/percent/value! Continuing.
Jan 26 10:23:21 php-backend-8s46 collectd[1532]: write_gcm: can not take infinite value
Jan 26 10:23:21 php-backend-8s46 collectd[1532]: write_gcm: wg_typed_value_create_from_value_t_inline failed for swap/percent/value! Continuing.
Jan 26 10:23:21 php-backend-8s46 collectd[1532]: write_gcm: curl_easy_perform() failed: Timeout was reached
Jan 26 10:23:21 php-backend-8s46 collectd[1532]: write_gcm: Error -1 from wg_curl_get_or_post
Jan 26 10:23:21 php-backend-8s46 collectd[1532]: write_gcm: wg_transmit_unique_segment failed.
Jan 26 10:23:21 php-backend-8s46 collectd[1532]: write_gcm: wg_transmit_unique_segments failed. Flushing.
Jan 26 10:23:21 php-backend-8s46 collectd[1532]: uc_update: Value too old: name = php-backend-8s46/processes-all/io_octets; value time = 1611631113.168; last cache update = 1611631113.168;
Jan 26 10:23:21 php-backend-8s46 collectd[1532]: uc_update: Value too old: name = php-backend-8s46/processes-all/io_octets; value time = 1611631113.167; last cache update = 1611631113.168;
Jan 26 10:23:21 php-backend-8s46 collectd[1532]: uc_update: Value too old: name = php-backend-8s46/processes-all/ps_rss; value time = 1611631113.942; last cache update = 1611631113.942;
Jan 26 10:23:21 php-backend-8s46 collectd[1532]: uc_update: Value too old: name = php-backend-8s46/processes-all/ps_rss; value time = 1611631113.943; last cache update = 1611631113.943;
Jan 26 10:23:21 php-backend-8s46 collectd[1532]: uc_update: Value too old: name = php-backend-8s46/processes-all/disk_octets; value time = 1611631113.943; last cache update = 1611631113.944;

1 Answer

These errors indicate a problem with the Google Cloud Monitoring agent configuration.
Check whether the Stackdriver API is enabled (it's not enabled by default).
Also make sure that the service account for this instance has the proper permissions to write to Stackdriver:

gcloud projects add-iam-policy-binding PROJECT_NAME --member="serviceAccount:SERVICE_ACCOUNT_EMAIL" --role="roles/logging.logWriter"
 
gcloud projects add-iam-policy-binding PROJECT_NAME --member="serviceAccount:SERVICE_ACCOUNT_EMAIL" --role="roles/monitoring.metricWriter"
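
For the API check mentioned earlier, a small helper along these lines may be handy. This is only a sketch: the function name monitoring_api_status is made up for illustration, and it assumes that "Stackdriver API" here refers to the Cloud Monitoring API (monitoring.googleapis.com). PROJECT_NAME is the same placeholder as in the commands above.

```shell
# Hypothetical helper (not part of gcloud): reports whether the Cloud Monitoring
# API appears in the list of enabled services for the given project.
# Assumption: the API in question is monitoring.googleapis.com.
monitoring_api_status() {
  project="$1"
  if gcloud services list --enabled --project "$project" \
       --format="value(config.name)" | grep -qx "monitoring.googleapis.com"; then
    echo "enabled"
  else
    echo "disabled"
  fi
}

# Usage:
#   monitoring_api_status PROJECT_NAME
# If it prints "disabled", the API can be turned on with:
#   gcloud services enable monitoring.googleapis.com --project PROJECT_NAME
```

The helper only reads the list of enabled services; enabling the API is left as a separate, explicit step.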

If you are still seeing these errors:

write_gcm: can not take infinite value  
write_gcm: wg_typed_value_create_from_value_t_inline failed for swap/percent/value! Continuing.

then edit /etc/stackdriver/collectd.conf and remove the following part:

LoadPlugin swap
<Plugin "swap">
  ValuesPercentage true
</Plugin>

and restart the Stackdriver agent (for example, with sudo service stackdriver-agent restart).

You can also double-check if your config is in line with these instructions.
If you are still facing errors, try these troubleshooting steps.

Sergiusz
  • Hi Sergiusz Rusiecki, thank you so much for the answer. Do you have any experience with the Google health check reporting the apps as running when they are actually not running smoothly? The system passes the data through, but the app (in this case php-fpm) does not receive it, so php-fpm never processes what nginx passes on. By the way, I will apply your suggestion for the errors collected in the GCP serial log. – Melki Saputro Feb 03 '21 at 15:34
  • Please share more details on your setup and health checks. – Sergiusz Feb 04 '21 at 08:51
  • Hi Sergiusz, sorry for the long delay in replying. Here is my configuration: Nginx -- Cluster LB (GCP), port 9001 TCP, with health check (PHP engine). Files are concentrated on one NFS server (Nginx and PHP use the same user and group). Health check: port: 9001, timeout: 5s, check interval: 10s, unhealthy threshold: 5 attempts. Let me know what you think. – Melki Saputro Feb 16 '21 at 22:33
  • 5 attempts * 10s means it would have to fail for more than 50 seconds to be detected, which seems a bit long. Try lowering the unhealthy threshold to 3. – Sergiusz Feb 17 '21 at 10:02
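
The arithmetic in the last comment can be sketched as a tiny shell function. The name detection_window is made up for illustration, and the estimate is deliberately rough: it multiplies the check interval by the unhealthy threshold and ignores the 5s timeout and any probe jitter.

```shell
# Rough estimate of how long a backend must keep failing before it is marked
# unhealthy: check interval (seconds) times unhealthy threshold (attempts).
# Hypothetical helper; ignores the per-probe timeout.
detection_window() {
  interval_s="$1"
  threshold="$2"
  echo $(( interval_s * threshold ))
}

detection_window 10 5   # settings from the comment above: prints 50
detection_window 10 3   # suggested threshold of 3: prints 30
```

With the same 10s interval, dropping the threshold from 5 to 3 shortens the estimated window from about 50 to about 30 seconds.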