I am trying to create a Dataproc cluster using the google-cloud-python library. However, when I set region = 'us-central1', I get the following exception:

google.api_core.exceptions.InvalidArgument: 400 Region 'us-central1' is invalid.
Please see https://cloud.google.com/dataproc/docs/concepts/regional-endpoints
for additional information on regional endpoints

Code (based on example):

#!/usr/bin/python

from google.cloud import dataproc_v1

client = dataproc_v1.ClusterControllerClient()

project_id = 'my-project'
region = 'us-central1'
cluster = {...}

response = client.create_cluster(project_id, region, cluster)
tix

3 Answers


Dataproc uses the region field to route REST requests; gRPC clients, however, do not use the field for routing (hence the error).

Only the global multiregion can be accessed through the default endpoint. To use a regional endpoint such as us-central1, you have to configure the endpoint address on the client's transport.

The Dataproc regional endpoints follow this pattern: <region>-dataproc.googleapis.com:443. The region field should be set to the same value as the region in the endpoint.
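
The endpoint pattern above can be captured in a small helper. This is only a sketch: the function names are hypothetical, and the host used for the global case (dataproc.googleapis.com) is an assumption about the default endpoint rather than something stated in this answer.

```python
# Hypothetical helpers illustrating the <region>-dataproc.googleapis.com:443 pattern.

DATAPROC_HOST = "dataproc.googleapis.com"  # assumed default (global) host
PORT = 443


def regional_endpoint(region):
    """Return the gRPC endpoint address for the given Dataproc region."""
    if not region:
        raise ValueError("region must be a non-empty string")
    if region == "global":
        # The global multiregion is served by the default endpoint.
        return f"{DATAPROC_HOST}:{PORT}"
    return f"{region}-{DATAPROC_HOST}:{PORT}"


def region_matches_endpoint(region, endpoint):
    """Check that the region field and the endpoint refer to the same region."""
    return endpoint == regional_endpoint(region)


print(regional_endpoint("us-central1"))
# prints: us-central1-dataproc.googleapis.com:443
```

Checking the region against the endpoint before sending a request can catch the mismatch locally instead of as a 400 response from the service.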

Example:

#!/usr/bin/python

from google.cloud import dataproc_v1
from google.cloud.dataproc_v1.gapic.transports import cluster_controller_grpc_transport

transport = cluster_controller_grpc_transport.ClusterControllerGrpcTransport(
    address='us-central1-dataproc.googleapis.com:443')
client = dataproc_v1.ClusterControllerClient(transport)

project_id = 'my-project'
region = 'us-central1'
cluster = {...}

response = client.create_cluster(project_id, region, cluster)
tix
  • [The related issue](https://github.com/googleapis/google-cloud-python/issues/5884#issuecomment-494879727) was raised in the **google-cloud-python** repository, so they will probably make this easier to manage someday. – GoodDok Sep 18 '19 at 09:04

For now, the recommended way to change the default API endpoint is to use client_options:

client_options (Union[dict, google.api_core.client_options.ClientOptions]) – Client options used to set user options on the client. API Endpoint should be set through client_options.

Here's an example that loads credentials from a JSON file (Python 3.6+ syntax with an f-string):

from google.cloud.dataproc_v1 import ClusterControllerClient

# service_account_json_path and your_region are placeholders you must define.
client = ClusterControllerClient.from_service_account_file(
    service_account_json_path,
    client_options={'api_endpoint': f'{your_region}-dataproc.googleapis.com:443'})
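
If the region varies at runtime, the same idea can be wrapped in a helper. The following is a minimal sketch, assuming the google-cloud-dataproc package is installed when make_cluster_client is called; both function names are hypothetical, and treating 'global' as "use the default endpoint" follows the first answer rather than the library documentation.

```python
def client_kwargs(region):
    """Build keyword arguments for ClusterControllerClient for a given region."""
    if region == "global":
        # Assumption: the default endpoint already serves the global multiregion,
        # so no endpoint override is needed.
        return {}
    return {
        "client_options": {
            "api_endpoint": f"{region}-dataproc.googleapis.com:443",
        }
    }


def make_cluster_client(key_path, region):
    """Create a client bound to the regional endpoint, using a JSON key file."""
    # Imported lazily so client_kwargs() can be used without the library installed.
    from google.cloud.dataproc_v1 import ClusterControllerClient

    return ClusterControllerClient.from_service_account_file(
        key_path, **client_kwargs(region)
    )
```

Pass the same region value to create_cluster that was used to build the client, so the region field and the endpoint stay in agreement.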
GoodDok

Similarly, using the google-cloud-java client:

ClusterControllerSettings settings =
    ClusterControllerSettings.newBuilder()
        .setEndpoint("us-central1-dataproc.googleapis.com:443")
        .build();
try (ClusterControllerClient clusterControllerClient = ClusterControllerClient.create(settings)) {
  String projectId = "my-project";
  String region = "us-central1";
  Cluster cluster = Cluster.newBuilder().build();
  Cluster response =
      clusterControllerClient.createClusterAsync(projectId, region, cluster).get();
}
tix