2

I'd like the Envoy access logs from Istio (i.e. logs that record every HTTP request) to show up in Stackdriver Logging. I've tried following the steps in https://istio.io/docs/tasks/telemetry/logs/access-log/. However, the default accessLogFile setting for Istio on GKE seems to be empty, and if I change it with kubectl edit configmap -n istio-system istio, the system resets it after a few minutes.

Is there a way to push Istio on GKE's access logs into Stackdriver?

MrMage

2 Answers

4

For the Google-managed version of Istio (enabled by checking the box on your GKE cluster), versions 1.13 and above have access logs disabled by default, so the configmap has accessLogFile: "". On 1.12 and older versions, access logs are enabled by default, so the configmap has accessLogFile: "/dev/stdout".

As you have noted, you cannot change it yourself, because the reconciliation loop will wipe the change.
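To check which of the two defaults a given cluster has, you can look at the accessLogFile value in the mesh configuration. Below is a minimal Python sketch of that check. It is not part of any Istio or GKE tooling: access_log_file is a hypothetical helper, and it assumes you have already read the mesh text out of the istio configmap in istio-system (for example by copying it from kubectl get configmap output).

```python
import re


def access_log_file(mesh_config):
    """Return the accessLogFile value from mesh config text, or None if absent.

    Hypothetical helper: a simple line match, not a full YAML parser.
    Comment lines such as '# Set accessLogFile to ...' are ignored because
    the key must be the first token on its line.
    """
    match = re.search(
        r'^[ \t]*accessLogFile:[ \t]*"?([^"\n]*?)"?[ \t]*$',
        mesh_config,
        re.MULTILINE,
    )
    return match.group(1) if match else None


def access_logs_enabled(mesh_config):
    """Access logging is on only when accessLogFile is a non-empty path."""
    return bool(access_log_file(mesh_config))


if __name__ == "__main__":
    sample = 'accessLogFile: ""\n'  # illustrative input, not real cluster output
    print(access_logs_enabled(sample))
```

An empty string means access logging is off, and a path such as /dev/stdout means the proxy writes access logs to its standard output.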

I logged a support case with Google to find out the best approach and they suggested using the Mixer logs instead. To access these, you need Stackdriver enabled on your GKE cluster (either legacy or the newer Kubernetes Engine monitoring). You can then use the filter logName="projects/[PROJECT-NAME]/logs/server-accesslog-stackdriver.logentry.istio-system".

To see the requests between two services you would use this Stackdriver query:

logName="projects/[PROJECT-NAME]/logs/server-accesslog-stackdriver.logentry.istio-system"
labels.destination_app="[YOUR-SERVICE]"
labels.source_app="[YOUR-OTHER-SERVICE]"

To see the requests originating from outside GKE and flowing through the Istio Ingress Gateway:

logName="projects/[PROJECT-NAME]/logs/server-accesslog-stackdriver.logentry.istio-system"
labels.destination_app="[YOUR-SERVICE]"
labels.source_app="istio-ingressgateway"
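
The two queries differ only in the source_app label, so if you run them often it may be easier to generate them. Here is a minimal Python sketch that only builds the filter text; mixer_accesslog_filter and its parameter names are hypothetical, values are inserted verbatim without escaping, and the kind parameter exists because the middle part of the log name (logentry here) may differ between versions.

```python
def mixer_accesslog_filter(project, destination_app, source_app, kind="logentry"):
    """Build a Stackdriver advanced filter for the Mixer access logs.

    Hypothetical helper. Each line of the filter is a separate condition;
    Stackdriver combines lines with an implicit AND.
    """
    log_name = (
        f"projects/{project}/logs/"
        f"server-accesslog-stackdriver.{kind}.istio-system"
    )
    return "\n".join(
        [
            f'logName="{log_name}"',
            f'labels.destination_app="{destination_app}"',
            f'labels.source_app="{source_app}"',
        ]
    )


if __name__ == "__main__":
    # Placeholder project and service names; replace with your own.
    print(mixer_accesslog_filter("my-project", "my-service", "istio-ingressgateway"))
```

The printed text can be pasted into the advanced filter box as is.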

These logs aren't fully equivalent to the proxy access logs, however, and may not help with troubleshooting every scenario. There is an open feature request with Google to support customization of the Istio configmap, including the accessLogFile setting: https://issuetracker.google.com/issues/126527530

I'd suggest that anyone interested in this feature vote for it by adding a star.

Hope that helps!

Davep
  • Thank you for the answer! I just checked, and it looks like the Mixer access logs only include requests sent to the Mixer service (i.e. with URL `/istio.mixer.v1.Mixer/Report`). So they are not useful to find out what services inside my cluster were accessed. – MrMage Aug 19 '19 at 08:08
  • 1
    You're right. The Mixer logs don't seem particularly helpful for troubleshooting requests flowing through the Envoy proxy. I've re-opened my case with Google and will update my answer if they provide anything helpful. – Davep Aug 21 '19 at 03:42
  • In my case there were requests to my services within the Mixer logs but somewhat hidden in all the noise. I've added an explanation of some more specific queries to my answer which should help you find your requests. – Davep Aug 28 '19 at 00:31
  • Sadly the Mixer logs are very incomplete in my case. I have commented on and starred the issue you mentioned, thank you! – MrMage Aug 29 '19 at 10:57
  • The `logName` syntax seems to have changed: `logName="projects/[PROJECT-NAME]/logs/server-accesslog-stackdriver.instance.istio-system"` Note the `instance` instead of the `logentry` – Saverio Proto Jun 19 '20 at 17:13
-1

In GKE, all stdout and stderr output is collected and sent to the node's log-rotate for later parsing and export to Stackdriver via Fluentd.

The access logs are available through the kubectl logs command, which means they are on the node and are being parsed and exported by the Fluentd agent.

I replicated this and was able to find the access logs using the following Stackdriver advanced filter (change it according to your own resources):

resource.type="container"
resource.labels.cluster_name="gke-cluster"
resource.labels.namespace_id="application-namespace"
resource.labels.project_id="project-id"
resource.labels.zone:"gcp-zone1-a"
resource.labels.container_name="istio-proxy"
resource.labels.pod_id:"sleep-"

The important lines are resource.labels.container_name="istio-proxy", which selects the istio-proxy container, and resource.labels.pod_id:"sleep-", which matches every replica of the pod you are interested in.
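
If you need this filter for several workloads, a small helper can fill in the values. Below is a minimal Python sketch that only builds the filter text; istio_proxy_filter and its parameter names are hypothetical, and values are inserted verbatim without escaping. It keeps the same operators as the filter above: = for exact matches and : for substring matches on the zone and the pod name prefix.

```python
def istio_proxy_filter(project_id, cluster_name, namespace, pod_prefix, zone=None):
    """Build a Stackdriver advanced filter for istio-proxy container logs.

    Hypothetical helper. pod_prefix is matched with ':' (substring match),
    so one prefix such as 'sleep-' covers every replica of that pod.
    """
    parts = [
        'resource.type="container"',
        f'resource.labels.cluster_name="{cluster_name}"',
        f'resource.labels.namespace_id="{namespace}"',
        f'resource.labels.project_id="{project_id}"',
    ]
    if zone is not None:
        # Optional: narrow the search to a zone, also by substring match.
        parts.append(f'resource.labels.zone:"{zone}"')
    parts.append('resource.labels.container_name="istio-proxy"')
    parts.append(f'resource.labels.pod_id:"{pod_prefix}"')
    return "\n".join(parts)


if __name__ == "__main__":
    # Placeholder values taken from the filter above; replace with your own.
    print(
        istio_proxy_filter(
            "project-id", "gke-cluster", "application-namespace", "sleep-",
            zone="gcp-zone1-a",
        )
    )
```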

Regarding the configMap, since GKE is a managed Kubernetes implementation, you're not supposed to change many configurations, including Fluentd. A reconciliation loop will automatically reset any changes attempted to these resources.

If you really need it, you can deploy your own unmanaged version of Fluentd when using GKE.

yyyyahir
  • Are you using the official "Istio on GKE" distribution or the OSS version? With the GKE version, the `istio-proxy` logs only contain a few rare debug messages, but no access logs. – MrMage Jul 19 '19 at 09:32
  • I've tried both and was able to see the incoming 418 GET requests (in `sleep` and `httpbin`). Maybe you have some [Stackdriver exclusions](https://cloud.google.com/logging/docs/exclusions) set? – yyyyahir Jul 19 '19 at 17:45
  • I just ran `kubectl` logs on my `istio-proxy` container, and even that didn't show any logs. What does your ConfigMap show for the `accessLogFile` when using Istio on GKE? – MrMage Jul 22 '19 at 08:06
  • After replacing it using the [Helm template](https://istio.io/docs/tasks/telemetry/logs/access-log/#enable-envoy-s-access-logging) it shows the following line: `# Set accessLogFile to empty string to disable access log. accessLogFile: "/dev/stdout"`. The lack of logs from `kubectl logs` shows that this is an Istio level issue rather than Stackdriver. Looks to me that it isn't logging, maybe reapply the Helm template. – yyyyahir Jul 22 '19 at 08:26
  • I asked about Istio on GKE, where I can't replace the Helm template. Do you really get access logs with Istio on GKE, and what does the ConfigMap look like there? – MrMage Jul 22 '19 at 08:39
  • I was replacing it but I launched a new cluster and noticed the line is there by default in `istio-system` namespace. This is GKE 1.12.8-gke.10 with Istio in permissive mode. – yyyyahir Jul 22 '19 at 08:50
  • Just retried and it worked. Basically just 1. labeled the namespace to allow sidecar injection, 2. [applied the sleep YAML file, 3. applied the httpbin YAML file](https://istio.io/docs/tasks/telemetry/logs/access-log/#before-you-begin), 4. [curled httpbin from sleep](https://istio.io/docs/tasks/telemetry/logs/access-log/#test-the-access-log) and 5. checked the Stackdriver logs using the advanced filter mentioned in my answer. – yyyyahir Jul 22 '19 at 09:06
  • Thanks, but what does the ConfigMap say in your Istio on GKE install? – MrMage Jul 22 '19 at 09:30
  • [Looks like this](https://gist.github.com/yyyyahir/8103ebe04572a1c70ca30a9e420cecc3) for the `mesh` data. – yyyyahir Jul 22 '19 at 10:35
  • 1
    Just a heads up that, like MrMage, our install of Istio on GKE has accessLogFile: "", and the issue is that when we fix it, it gets overwritten within minutes. Looking for a way to change it that sticks (since it's not a Helm install, there's nothing we can do there). – juhanic Jul 23 '19 at 03:12
  • I've just retried with the same positive results. I launched the cluster using `gcloud beta container clusters create istio3 --zone us-central1-a --num-nodes=5 --addons Istio --istio-config auth=MTLS_PERMISSIVE --async`. Istio version is labeled as `1.0.6`. If this doesn't work, I suggest you reach out to the GCP support team. – yyyyahir Jul 23 '19 at 08:22
  • In Istio 1.1 the default is to disable access logs, and Google does not permit modifying the configuration. This is why it works for folks running the older Istio 1.0 and not for folks running newer clusters with Istio 1.1. At this point we're looking at moving off the managed Istio, as they seem incredibly slow to keep up (current release is 1.4). – ramblingpolak Nov 19 '19 at 01:18