
I'm having trouble submitting an Apache Beam example from a local machine to our cloud platform.

Using `gcloud auth list` I can see that the correct account is currently active. I can use `gsutil` and the web console to interact with the file system, and I can use Cloud Shell to run pipelines through the Python REPL.

But when I try to run the Python wordcount example, I get the following error:

IOError: Could not upload to GCS path gs://my_bucket/tmp: access denied.
Please verify that credentials are valid and that you have write access 
to the specified path.

Is there something I am missing with regard to the credentials?

RHolland
  • Do you have a bucket named `my_bucket` that does *not* contain a folder or file `tmp`? – Mitch Lillie May 25 '17 at 15:43
  • `my_bucket` exists, as does the directory `tmp` – RHolland May 26 '17 at 08:52
  • It might have to do with the access scopes granted to the VM. Please see https://stackoverflow.com/questions/27275063/gsutil-copy-returning-accessdeniedexception-403-insufficient-permission-from – jldupont Jun 30 '17 at 19:43

6 Answers


Here are my two cents after spending the whole morning on the issue.

You should make sure that you log in with gcloud on your local machine; however, pay attention to the warning message returned by gcloud auth login:

WARNING: `gcloud auth login` no longer writes application default credentials.

These application default credentials are what the Python code needs in order to authenticate properly.

The solution is rather simple, just use: gcloud auth application-default login

This will write a credentials file to ~/.config/gcloud/application_default_credentials.json, which is used for authentication in the local development environment.
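
As a quick sanity check, you can confirm from Python that the application default credentials are now picked up. This is a minimal sketch assuming the google-auth package (installed alongside the Beam GCP dependencies):

import google.auth

# Raises DefaultCredentialsError if no application default credentials are found.
credentials, project = google.auth.default()
print("Loaded application default credentials; default project:", project)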

odedfos
  • How could I miss your answer. **I confirm that this is the solution. It worked for me.** Argh, I spent the afternoon looking for it by myself. – MBHA Phoenix Nov 09 '18 at 18:33
  • In my version of the CLI, `gcloud auth login` no longer outputs the warning message, so it's not easy to find this solution. See also: https://stackoverflow.com/questions/53306131/difference-between-gcloud-auth-application-default-login-and-gcloud-auth-logi. – Raman Nov 24 '20 at 16:24
  • spent some time on this issue, glad to find this answer – Akhil Apr 13 '21 at 12:52

You'll need to create a GCS bucket and folder for your project, then specify that as the pipeline parameter instead of using the default value.

https://cloud.google.com/storage/docs/creating-buckets
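
For example, assuming a bucket you own named your-bucket (an illustrative name; exact flags can vary between Beam versions), the wordcount example can be pointed at it explicitly:

python -m apache_beam.examples.wordcount \
    --runner DataflowRunner \
    --project your-project-id \
    --temp_location gs://your-bucket/tmp \
    --output gs://your-bucket/output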

Alex Amato
  • The bucket and directory exist. This bucket is then used for the `--output` and `--temp_location` pipeline parameters. Is there another parameter I am missing when running the wordcount example? – RHolland May 26 '17 at 10:37
  • Are you using a project that has permissions on the bucket? Is it the same project that owns the bucket? – Alex Amato May 31 '17 at 17:01
  • You can browse to the GCS folder in the Storage section. Select the project you are using and you should be able to see all the buckets visible to the project https://console.cloud.google.com/ – Alex Amato May 31 '17 at 17:03
  • I can see the bucket, I can create and delete directories, I can upload files to the bucket. This is while signed in to the same user as I see with `gcloud auth list`. – RHolland Jun 01 '17 at 09:53
  • How exactly do the credentials work? Does the `$GOOGLE_APPLICATION_CREDENTIALS` environment variable have anything to do with it or is it all handled through the `gcloud` command? – RHolland Jun 06 '17 at 09:34

I got the same error; it was solved after creating the bucket:
gsutil mb gs://<bucket-name-from-the-error>/
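
Equivalently, if the google-cloud-storage package is installed, the bucket can be created from Python (the project ID below is an illustrative placeholder):

from google.cloud import storage

# The client uses whatever credentials are active in the environment.
client = storage.Client(project="your-project-id")
client.create_bucket("bucket-name-from-the-error")  # names must be globally unique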


I have faced the same issue, where it throws the IOError. Things that helped me here (in no particular order):

  1. Check the name of the bucket. This step helped me a lot. Bucket names are global: if you make a mistake in the bucket name, you might be accessing a bucket that you have NOT created and have no permission to use.

  2. Check the service account key file that you have plugged in:

    • export GOOGLE_APPLICATION_CREDENTIALS=yourkeyfile.json

Then activate the service account for the key file you have plugged in:

gcloud auth activate-service-account --key-file=your-key-file.json 

Also, listing the available auth accounts might help:

gcloud auth list
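
To double-check which account a key file actually identifies, here is a minimal Python sketch (assumes the google-auth package; the key file name is an illustrative placeholder):

from google.oauth2 import service_account

# Load the key file directly and print the service account it belongs to.
creds = service_account.Credentials.from_service_account_file("your-key-file.json")
print(creds.service_account_email)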
mohammed_ayaz

One solution might work for you. It did for me.

In the Cloud Shell window, click on "Launch code Editor" (the pencil icon). The editor works in Chrome (not sure about Firefox); it did not work in the Brave browser.

Now, in the launched code editor, browse to your code file (.py or .java), locate the predefined PROJECT and BUCKET names, replace them with your own project and bucket names, and save the file.

Now execute the file, and it should work.
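
For reference, the predefined names in such examples typically feed the pipeline options along these lines (a minimal sketch; PROJECT and BUCKET are illustrative placeholders):

from apache_beam.options.pipeline_options import GoogleCloudOptions, PipelineOptions

PROJECT = "your-project-id"  # replace with your own project ID
BUCKET = "your-bucket"       # replace with your own bucket name

options = PipelineOptions()
gcp_options = options.view_as(GoogleCloudOptions)
gcp_options.project = PROJECT
gcp_options.temp_location = "gs://%s/tmp" % BUCKET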

Dev Ranjan

Python doesn't use `gcloud auth` to authenticate; it uses the GOOGLE_APPLICATION_CREDENTIALS environment variable instead. So before you run the Python command to launch the Dataflow job, you will need to set that environment variable:

export GOOGLE_APPLICATION_CREDENTIALS="/path/to/key"

More info on setting up the environment variable: https://cloud.google.com/docs/authentication/getting-started#setting_the_environment_variable

Then you'll have to make sure that the account you set up has the necessary permissions and service-account roles in your GCP project.
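
To confirm that the account behind the key actually has write access to the temp path, here is a minimal sketch using the google-cloud-storage client (bucket and object names are illustrative):

from google.cloud import storage

client = storage.Client()  # picks up GOOGLE_APPLICATION_CREDENTIALS automatically
blob = client.bucket("my_bucket").blob("tmp/write_check.txt")
blob.upload_from_string("test")  # fails with a 403 error if write access is missing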

mellybear