I have a notebook in which I access data through APIs, process the data, and send the results to a BigQuery table as well as to a GCS bucket. Everything works as it should when the notebook is run manually.
However, when it is scheduled, it fails with the following error:
/opt/conda/lib/python3.7/site-packages/pandas/core/frame.py in to_gbq(self, destination_table, project_id, chunksize, reauth, if_exists, auth_local_webserver, table_schema, location, progress_bar, credentials)
1938 location=location,
1939 progress_bar=progress_bar,
-> 1940 credentials=credentials,
1941 )
1942
/opt/conda/lib/python3.7/site-packages/pandas/io/gbq.py in to_gbq(dataframe, destination_table, project_id, chunksize, reauth, if_exists, auth_local_webserver, table_schema, location, progress_bar, credentials)
221 location=location,
222 progress_bar=progress_bar,
--> 223 credentials=credentials,
224 )
/opt/conda/lib/python3.7/site-packages/pandas_gbq/gbq.py in to_gbq(dataframe, destination_table, project_id, chunksize, reauth, if_exists, auth_local_webserver, table_schema, location, progress_bar, credentials, api_method, verbose, private_key)
1155 credentials=connector.credentials,
1156 )
-> 1157 table_connector.create(table_id, table_schema)
1158 else:
1159 original_schema = pandas_gbq.schema.to_pandas_gbq(table.schema)
/opt/conda/lib/python3.7/site-packages/pandas_gbq/gbq.py in create(self, table_id, schema)
1311 _Dataset(
1312 self.project_id, credentials=self.credentials, location=self.location,
-> 1313 ).create(self.dataset_id)
1314
1315 table_ref = TableReference(
/opt/conda/lib/python3.7/site-packages/pandas_gbq/gbq.py in create(self, dataset_id)
1413 self.client.create_dataset(dataset)
1414 except self.http_error as ex:
-> 1415 self.process_http_error(ex)
/opt/conda/lib/python3.7/site-packages/pandas_gbq/gbq.py in process_http_error(ex)
384 raise QueryTimeout("Reason: {0}".format(ex))
385
--> 386 raise GenericGBQException("Reason: {0}".format(ex))
387
388 def download_table(
GenericGBQException: Reason: 403 POST https://bigquery.googleapis.com/bigquery/v2/projects/****************/datasets?prettyPrint=false: Access Denied: Project ******************: User does not have bigquery.datasets.create permission in project *****************.
(The stars mask the project ID; notably, it is not a project ID I have chosen.)
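That unfamiliar project ID suggests the scheduled run resolves different default credentials than my interactive session does. As a diagnostic, here is a minimal sketch (using google.auth, which the BigQuery client libraries rely on) to print which project and identity the runtime actually sees:

import google.auth

# Resolve the application-default credentials of the current runtime;
# on a scheduled execution this is typically a service account,
# not my user account.
credentials, project = google.auth.default()
print("Resolved project:", project)
# Service-account credentials expose the account email; the fallback
# covers plain user credentials (assumption: attribute name as below).
print("Identity:", getattr(credentials, "service_account_email", "user credentials"))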
The error is raised by the following part of the script:
import io
import base64

import matplotlib.pyplot as plt
from google.cloud import storage

# Write the aggregated results to BigQuery and to a GCS bucket.
Table_grouped_to_bg.to_gbq('retailer_accuracy.testbin', Context.default().project_id,
                           chunksize=10000, if_exists='append')
Table_grouped_to_bg.to_csv('gs://retailer-sectionalized-labels-csv/' + 'Table_grouped_'
                           + str(Store_name) + '_' + str(Date) + '.csv',
                           sep=';', encoding='utf-8-sig')

# Bar plot of the percentage in each distance bin, annotated with bar values.
ax1 = Table_grouped.plot.bar(x="distance_bin", y="percentage", rot=70,
                             title="Percentage in each bin, {}".format(Date),
                             figsize=(15, 5))
x_offset = -0.01
y_offset = 0.02
for p in ax1.patches:
    b = p.get_bbox()
    val = "{:+.2f}".format(b.y1 + b.y0)
    ax1.annotate(val, ((b.x0 + b.x1) / 2 + x_offset, b.y1 + y_offset))
fig = ax1.get_figure()

def saving_figure(path_logdir):
    fig_to_upload = plt.gcf()
    # Save the figure image to a bytes buffer.
    buf = io.BytesIO()
    fig_to_upload.savefig(buf, format='png')
    buf.seek(0)
    image_as_a_string = base64.b64encode(buf.read())
    # Init the GCS client and upload the buffer contents; path_logdir is the
    # path under which the file is stored in the bucket.
    client = storage.Client()
    bucket = client.get_bucket('retailer-sectionalized-statistics-plot-png')
    blob = bucket.blob(path_logdir)
    blob.upload_from_string(image_as_a_string, content_type='image/png')

plt.show(block=True)
saving_figure('Accuracy_barplot_' + str(Store_name) + '_' + str(Date) + '.png')
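Presumably the GCS upload in saving_figure relies on the same default credentials, so a scheduled run could hit the same permission wall on the bucket. A minimal sketch of constructing that client with explicit credentials instead (the key-file path is a placeholder, not something from my setup):

from google.cloud import storage
from google.oauth2 import service_account

# Hypothetical key file for a service account with access to the bucket.
creds = service_account.Credentials.from_service_account_file("/path/to/key.json")
client = storage.Client(credentials=creds, project=creds.project_id)
bucket = client.get_bucket('retailer-sectionalized-statistics-plot-png')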
I understand that it is no longer I who runs the notebook when it is scheduled. The question, then, is: how do I grant the required permissions to the GCP instance (service account) that runs the scheduled notebook?
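For concreteness, a minimal sketch of one possible fix (assuming a service-account key file; the path and role below are assumptions, not from my setup) is to pass explicit credentials to to_gbq, which accepts a credentials argument, as the traceback above shows:

from google.oauth2 import service_account

# Hypothetical key file for a service account granted a role that includes
# bigquery.datasets.create (for example roles/bigquery.dataEditor).
credentials = service_account.Credentials.from_service_account_file(
    "/path/to/service-account-key.json"
)

Table_grouped_to_bg.to_gbq(
    'retailer_accuracy.testbin',
    project_id=credentials.project_id,
    chunksize=10000,
    if_exists='append',
    credentials=credentials,  # overrides the scheduled runtime's default identity
)

Alternatively, if the scheduled executor runs as a service account, granting that account a BigQuery role with dataset-creation rights on the target project in IAM should let the notebook run unmodified.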