I am deploying containers that run Python apps to GKE, and I get an error when I try to use OpenCensus to send trace messages:

Traceback (most recent call last):
  File "/usr/local/lib/python3.7/site-packages/opencensus/metrics/transport.py", line 59, in func
    return self.func(*aa, **kw)
  File "/usr/local/lib/python3.7/site-packages/opencensus/metrics/transport.py", line 113, in export_all
    export(itertools.chain(*all_gets))
  File "/usr/local/lib/python3.7/site-packages/opencensus/ext/stackdriver/stats_exporter/__init__.py", line 162, in export_metrics
    self.client.project_path(self.options.project_id), ts_batch)
  File "/usr/local/lib/python3.7/site-packages/google/cloud/monitoring_v3/gapic/metric_service_client.py", line 1024, in create_time_series
    request, retry=retry, timeout=timeout, metadata=metadata
  File "/usr/local/lib/python3.7/site-packages/google/api_core/gapic_v1/method.py", line 143, in __call__
    return wrapped_func(*args, **kwargs)
  File "/usr/local/lib/python3.7/site-packages/google/api_core/retry.py", line 273, in retry_wrapped_func
    on_error=on_error,
  File "/usr/local/lib/python3.7/site-packages/google/api_core/retry.py", line 182, in retry_target
    return target()
  File "/usr/local/lib/python3.7/site-packages/google/api_core/timeout.py", line 214, in func_with_timeout
    return func(*args, **kwargs)
  File "/usr/local/lib/python3.7/site-packages/google/api_core/grpc_helpers.py", line 59, in error_remapped_callable
    six.raise_from(exceptions.from_grpc_error(exc), exc)
  File "<string>", line 3, in raise_from
google.api_core.exceptions.InvalidArgument: 400 One or more TimeSeries could not be written: The set of resource labels is incomplete. Missing labels: (container_name namespace_name).: timeSeries[0-199]

The interesting part seems to be this sentence: Missing labels: (container_name namespace_name).

When I run the exact same code locally, I do not receive any errors and I do see my tracing appearing in Stackdriver Metrics Explorer, so the problem appears to be related specifically to running inside a container in GKE.

Is there something specific that is required to get OpenCensus working in a GKE container?

seawolf
  • Does it work locally containerised? I suspect (hope) not. You need to provide the container with your service's credentials. Have you mounted the Google Cloud Platform service account into Kubernetes? – DazWilkin Sep 28 '19 at 19:46
  • This is boilerplate for accessing any GCP service from within a Kubernetes Engine pod using a GCP service account: https://cloud.google.com/kubernetes-engine/docs/tutorials/authenticating-to-cloud-platform#step_4_import_credentials_as_a_secret – DazWilkin Sep 28 '19 at 19:48
  • @DazWilkin I appreciate the suggestions, but this is not about not having a service account: the service is currently successfully using Stackdriver for logging, and is also using Firestore, Memorystore, and PubSub without issues, so credentials are clearly allowing access to things. Is there a *particular* role that must be assigned to the service account that might be the culprit? – seawolf Oct 01 '19 at 17:45
  • On further reflection, I believe credentials can be entirely eliminated. As you can see in the output, I am able to successfully reach the service. The error is about an API call that is missing data, not about a call that is rejected for access reasons. – seawolf Oct 01 '19 at 17:46
  • Interesting. See: https://github.com/census-instrumentation/opencensus-python/issues/647 – DazWilkin Oct 01 '19 at 18:39
  • Yeah, that's me commenting at the end and asking when this was merged. This appears to be a regression, but I'm not sure. – seawolf Oct 01 '19 at 19:48
  • Opened an issue in the OpenCensus repository as a probable regression: https://github.com/census-instrumentation/opencensus-python/issues/796 – seawolf Oct 01 '19 at 20:10

1 Answer

You need to manually set two environment variables in your container: CONTAINER_NAME and NAMESPACE. I believe GKE should be setting these and isn't, so OpenCensus can't find the values it expects, which would explain the missing container_name and namespace_name labels in the error. A sample fix is to include both variables in the podspec:

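        spec:
          containers:
          - env:  # containers is a list, so env belongs to a container entry
            # NAMESPACE is read from the pod's own metadata
            - name: NAMESPACE
              valueFrom:
                fieldRef:
                  fieldPath: metadata.namespace
            # CONTAINER_NAME is set by hand; {{ APP }} and {{ NAME }} are template placeholders
            - name: CONTAINER_NAME
              value: {{ APP }}-collectors-{{ NAME }}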

More details: https://github.com/census-instrumentation/opencensus-python/issues/796#issuecomment-539109321
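
To confirm the variables actually reach the process, a small startup check along these lines may help. This is only a sketch: `missing_gke_env` is a hypothetical helper name, not part of OpenCensus, and it assumes that CONTAINER_NAME and NAMESPACE are the only two variables involved, as described above.

```python
import os

# Variable names taken from the answer above; the helper itself is hypothetical.
REQUIRED_VARS = ("CONTAINER_NAME", "NAMESPACE")


def missing_gke_env(environ=None):
    """Return the required variable names that are unset or empty."""
    if environ is None:
        environ = os.environ
    return [name for name in REQUIRED_VARS if not environ.get(name)]


if __name__ == "__main__":
    missing = missing_gke_env()
    if missing:
        print("Missing environment variables: " + ", ".join(missing))
    else:
        print("CONTAINER_NAME and NAMESPACE are both set")
```

Running this inside the container before the exporter is created should show quickly whether the podspec change took effect.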

seawolf