I am trying to read a CSV file in a Google Cloud bucket locally using Pandas. I have logged in using gcloud auth login and have configured the project. However, when I try to read the CSV file using df = pd.read_csv(f"gs://mybucket/myfolder/mycsv.csv") I get a 401 error:

Anonymous caller does not have storage.objects.get access to the Google Cloud Storage object., 401

What further steps should I take so that I can read the CSV file directly? I have checked the gcloud config and my account is listed there.

ashes999
  • It's unclear but I suspect you need to authenticate. The best way to do this is to use Application Default Credentials. See: https://cloud.google.com/docs/authentication/production#automatically – DazWilkin May 31 '21 at 22:26
  • ... And you'll want to ensure the account has, at least `roles/storage.objectViewer` or `roles/storage.objectAdmin`. See: https://cloud.google.com/docs/authentication/production#automatically – DazWilkin May 31 '21 at 22:40

1 Answer

The problem is that the credentials established by gcloud auth login are not picked up by your code: they are used by the gcloud CLI itself, whereas client libraries look for Application Default Credentials. Please see this great SO question and its answer for an in-depth explanation.

As suggested in the above-mentioned question, you can use gcloud auth application-default login instead.

As suggested in the SDK documentation, you can set the value of the GOOGLE_APPLICATION_CREDENTIALS environment variable as well.
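To check which of these two sources Application Default Credentials would likely pick up on your machine, a small stdlib-only sketch such as the following may help. It assumes the default gcloud configuration directory (`~/.config/gcloud` on Linux and macOS, `%APPDATA%\gcloud` on Windows), and the function name `describe_adc_source` is made up for illustration.

```python
import os
from pathlib import Path


def describe_adc_source(environ=None, home=None):
    """Report where Application Default Credentials would likely be found.

    Mirrors the first two steps of the ADC lookup order: the
    GOOGLE_APPLICATION_CREDENTIALS variable, then the file written by
    `gcloud auth application-default login`. Assumes the default gcloud
    config directory; metadata-server credentials are not checked.
    """
    environ = os.environ if environ is None else environ
    home = Path.home() if home is None else Path(home)

    # 1. An explicit key file named by the environment variable wins.
    key_path = environ.get("GOOGLE_APPLICATION_CREDENTIALS")
    if key_path:
        return f"env var -> {key_path}"

    # 2. Otherwise look for the user credentials file created by gcloud.
    if os.name == "nt":
        config_dir = Path(environ.get("APPDATA", home)) / "gcloud"
    else:
        config_dir = home / ".config" / "gcloud"
    adc_file = config_dir / "application_default_credentials.json"
    if adc_file.is_file():
        return f"gcloud ADC file -> {adc_file}"

    return "no local ADC found"
```

Calling `print(describe_adc_source())` before `pd.read_csv` shows whether either option has taken effect; "no local ADC found" is consistent with the anonymous-caller 401 in the question.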

Finally, you can initialize your storage client with explicit credentials; see the relevant documentation here, which also gives a good summary of all the authentication options mentioned above.
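For the pandas call in the question specifically, one way to pass explicit credentials is through `storage_options`, which pandas forwards to the gcsfs filesystem behind `gs://` URLs. The sketch below assumes pandas 1.2 or later with gcsfs installed; the helper names and any key path are hypothetical.

```python
def gcs_storage_options(key_path=None):
    """Build the storage_options dict that pandas hands to gcsfs.

    With no key file, "google_default" asks gcsfs to use Application
    Default Credentials; otherwise the token is the path to a
    service-account JSON key.
    """
    return {"token": key_path or "google_default"}


def read_bucket_csv(uri, key_path=None):
    """Read a CSV from a gs:// URI with explicit credentials."""
    import pandas as pd  # imported lazily so the helper above works without pandas

    return pd.read_csv(uri, storage_options=gcs_storage_options(key_path))
```

For example, `read_bucket_csv("gs://mybucket/myfolder/mycsv.csv", key_path="/path/to/key.json")` would read the file from the question using a service-account key at that (hypothetical) path. The account behind the key still needs a role such as `roles/storage.objectViewer`, as noted in the comments.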

jccampanero