
I'm currently using GKE Workload Identity to access Google Cloud Platform resources from within GKE. This works very well for Google Cloud Storage and other platform resources.

However, I encounter an issue with "insufficient authentication scopes" when I try to use GKE Workload Identity for accessing a Google Sheet.

When I generate a key file for the service account and use this in my code, I can manually set the scope to https://www.googleapis.com/auth/spreadsheets. It works just as expected and I can access the sheet. If I change the scope to https://www.googleapis.com/auth/cloud-platform, I get the same error as with GKE Workload Identity, "insufficient authentication scopes". This result shows that the service account works just fine, so the issue seems to be related to the scope assigned with the GKE Workload Identity.

With GKE Workload Identity, I retrieve the credentials in Python with credentials = google.auth.default() [1]. The credentials object has the expected service account, and the scope is set to https://www.googleapis.com/auth/cloud-platform. I can now access buckets and other cloud resources the service account has access to. However, Google Sheets seems to require the https://www.googleapis.com/auth/spreadsheets scope, and I have not found any way to set it. The workload identity (service account) and scope are retrieved from the GKE metadata server running in the GKE cluster. From what I can tell, the scope for GKE Workload Identity seems to be "hard coded" to https://www.googleapis.com/auth/cloud-platform. I have found no information on whether this can be changed.

(I tried adding the spreadsheets scope to the GKE node OAuth scopes, with no effect. From what I understand from the docs, node scopes should be unrelated anyway.)

(And of course I can just use a key file to make this work, but the whole point of GKE Workload Identity is to avoid the hassle of generating and distributing keys safely.)

[1] User Guide — google-auth 1.6.2 documentation

andehen

2 Answers


From the google-auth guide, are you setting the spreadsheet scope like this?

credentials, project = google.auth.default(
    scopes=['https://www.googleapis.com/auth/spreadsheets'])

I see some of the same behavior when using default clients, but when testing out with curl I do have some success using Workload Identity.

We can perform the flow using curl on a test pod (e.g. deploy an ubuntu pod and install curl). You should be able to verify on a GKE pod if scoped tokens function as you expect by curling the gke-metadata-server:

$ curl -H "Metadata-Flavor: Google" "http://169.254.169.254/computeMetadata/v1/instance/service-accounts/default/token?scopes=https://www.googleapis.com/auth/spreadsheets"

We can then use the token returned in a request to the sheets API like so, assuming we have set the ACCESS_TOKEN and SPREADSHEET_ID environment variables:

$ curl -X GET -H "Authorization: Bearer $ACCESS_TOKEN" "https://sheets.googleapis.com/v4/spreadsheets/$SPREADSHEET_ID"

This will return all information about your sheet, instead of a 403 error.
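The same two-step flow can be sketched in Python with only the standard library. This is a rough sketch rather than a definitive implementation: it assumes the metadata endpoint and its scopes query parameter behave as in the curl command above, and the function names (build_token_url, fetch_access_token, get_spreadsheet, main) are made up for illustration.

```python
import json
import os
import urllib.parse
import urllib.request

# Same metadata endpoint as in the curl command above.
METADATA_TOKEN_URL = (
    "http://169.254.169.254/computeMetadata/v1/"
    "instance/service-accounts/default/token"
)
SHEETS_SCOPE = "https://www.googleapis.com/auth/spreadsheets"


def build_token_url(scopes):
    """Return the token URL with the requested scopes as a query parameter.

    Assumption: multiple scopes are passed as a comma-separated list.
    """
    query = urllib.parse.urlencode({"scopes": ",".join(scopes)})
    return f"{METADATA_TOKEN_URL}?{query}"


def fetch_access_token(scopes):
    """Ask the metadata server for an access token with the given scopes."""
    request = urllib.request.Request(
        build_token_url(scopes), headers={"Metadata-Flavor": "Google"}
    )
    with urllib.request.urlopen(request, timeout=10) as response:
        return json.load(response)["access_token"]


def get_spreadsheet(spreadsheet_id, access_token):
    """Fetch spreadsheet metadata from the Sheets API using a bearer token."""
    url = (
        "https://sheets.googleapis.com/v4/spreadsheets/"
        + urllib.parse.quote(spreadsheet_id, safe="")
    )
    request = urllib.request.Request(
        url, headers={"Authorization": f"Bearer {access_token}"}
    )
    with urllib.request.urlopen(request, timeout=10) as response:
        return json.load(response)


def main():
    # Only meaningful inside a pod that can reach the metadata server.
    token = fetch_access_token([SHEETS_SCOPE])
    sheet = get_spreadsheet(os.environ["SPREADSHEET_ID"], token)
    print(sheet["properties"]["title"])
```

Calling main() from a pod with Workload Identity configured and SPREADSHEET_ID set should print the sheet title if the scoped token is accepted; a 401 or 403 surfaces as urllib.error.HTTPError.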

I believe this is what client libraries are supposed to be doing under the hood, but it's possible there are some bugs here.


Here is a working example of a Go app running on a GKE pod with Workload Identity configured (the service account has been granted viewer access to the sheet ID).

go.mod:

module example.com/m
  
go 1.13

require (
        golang.org/x/oauth2 v0.0.0-20210514164344-f6687ab2804c
        google.golang.org/api v0.48.0
)

main.go:

package main

import (
        "context"
        "fmt"

        "golang.org/x/oauth2/google"
        "google.golang.org/api/sheets/v4"
)

func main() {
        // Build an HTTP client from Application Default Credentials,
        // explicitly requesting the Sheets scope.
        client, err := google.DefaultClient(context.Background(), sheets.SpreadsheetsScope)
        if err != nil {
                panic(err)
        }
        srv, err := sheets.New(client)
        if err != nil {
                panic(err)
        }
        // Read the values in the given range of the sheet.
        resp, err := srv.Spreadsheets.Values.Get("REPLACE_WITH_YOUR_SHEET_ID", "REPLACE_WITH_YOUR_RANGE").Do()
        if err != nil {
                panic(err)
        }
        fmt.Printf("%+v\n", resp.Values)
}

FWIW, I noticed that older versions of the oauth2 libraries do not work with Workload Identity and scopes. Updating to newer versions solved the problem.

deviavir
  • I tried your curl commands but I get `401: Request had invalid authentication credentials. Expected OAuth 2 access token, login cookie or other valid authentication credential.` – Rutger de Knijf Dec 23 '21 at 18:49
  • @RutgerdeKnijf which CURL command returned that? I assume you passed on the BEARER token returned by the first CURL command into the second? – deviavir Dec 27 '21 at 19:29
  • Yes, your assumption is correct. I passed the token on, but the request (to `sheets.googleapis.com`) returned that 401. – Rutger de Knijf Jan 03 '22 at 12:25
  • Updating the `google-api-client` library version fixed this issue for me (using Ruby) – Adam Feb 18 '22 at 19:52

Yes, you can. I cannot reproduce your problem because this (now) just works:

import google.auth
from googleapiclient.discovery import build

SHEET_ID = '<your_sheet_id>'
RANGE = 'Sheet1!A:Z'

credentials, project_name = google.auth.default()

service = build('sheets', 'v4', credentials=credentials).spreadsheets()

result = service.values().get(spreadsheetId=SHEET_ID, range=RANGE).execute()

print(result)  # prints out the data in the sheet

This was tested on an Autopilot cluster running v1.20.10-gke.1600, with Workload Identity (WLI) set on the default KSA and the corresponding GSA email added as Viewer on the sheet.

Note that I didn't even set any scopes. You would think this would be required:

google.auth.default(scopes=['https://www.googleapis.com/auth/spreadsheets'])

But it is completely ignored: credentials.scopes = None
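Since credentials.scopes does not tell you much here, one way to see which scopes a token actually carries is to ask Google's tokeninfo endpoint. The following is a minimal sketch under assumptions: it relies on https://oauth2.googleapis.com/tokeninfo returning a JSON object with a space-separated scope field, and the helper names parse_scopes and fetch_token_scopes are hypothetical.

```python
import json
import urllib.parse
import urllib.request

# Assumption: the tokeninfo endpoint accepts an access_token query parameter.
TOKENINFO_URL = "https://oauth2.googleapis.com/tokeninfo"


def parse_scopes(tokeninfo):
    """Split the space-separated 'scope' field of a tokeninfo response."""
    return tokeninfo.get("scope", "").split()


def fetch_token_scopes(access_token):
    """Return the list of scopes the given access token was issued with."""
    query = urllib.parse.urlencode({"access_token": access_token})
    with urllib.request.urlopen(f"{TOKENINFO_URL}?{query}", timeout=10) as response:
        return parse_scopes(json.load(response))
```

With google-auth, the token is available as credentials.token after calling credentials.refresh(google.auth.transport.requests.Request()), so fetch_token_scopes(credentials.token) would show whether the spreadsheets scope is present.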

Rutger de Knijf