I am able to create a Google Cloud Dataproc cluster from the command line using a custom image:

gcloud beta dataproc clusters create cluster-name --image=custom-image-name

as specified in https://cloud.google.com/dataproc/docs/guides/dataproc-images, but I am unable to find information about how to do the same using the v1beta2 REST API in order to create a cluster from within Airflow. Any help would be greatly appreciated.

Georges Kohnen
  • Hi Georges, you can take a look at this url: https://cloud.google.com/dataproc/docs/reference/rest/v1beta2/projects.regions.clusters/create – Hackerman Apr 19 '18 at 20:36
  • Looks like that interface does not know the "image" parameter (yet)? – Georges Kohnen Apr 19 '18 at 20:41
  • In the request body, you can build something like `{ "clusterName": "", "config": { "softwareConfig": { "imageVersion": "" } } }`...it seems that `imageVersion` is the right one. – Hackerman Apr 19 '18 at 20:44
  • I think 'imageVersion' actually refers to the Dataproc version (https://cloud.google.com/dataproc/docs/concepts/versioning/dataproc-versions), not a custom image (which is a beta feature) – Georges Kohnen Apr 19 '18 at 20:52
  • You're looking for "imageUri" on masterConfig and workerConfig objects: https://cloud.google.com/dataproc/docs/reference/rest/v1beta2/ClusterConfig#InstanceGroupConfig – tix Apr 19 '18 at 20:53
  • Thanks, that seems to be the right way to go. Still getting an "Unknown Error" (Http 500) though. – Georges Kohnen Apr 19 '18 at 21:17

1 Answer

Custom images can, in principle, reside in a different project, as long as you grant read/use access on the image to the service account your Dataproc cluster runs as. For that reason, the API currently always requires a full image URI, not just a short name.

gcloud adds some syntactic sugar here and resolves the full URI for you automatically; you can see this in action by adding --log-http to your gcloud command:

gcloud beta dataproc clusters create foo --image=custom-image-name --log-http

If you have already created a cluster with gcloud, you can also run `gcloud dataproc clusters describe` on that cluster to see the fully resolved custom image URI.
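For the REST call itself, here is a minimal Python sketch of a request body that puts the full URI into `imageUri` on both `masterConfig` and `workerConfig`, as suggested in the comments. The helper names (`full_image_uri`, `build_cluster_body`), the `my-project` project ID, and the `compute/v1` URI prefix are illustrative assumptions, not something taken from the Dataproc docs; check the URI that `--log-http` or `describe` prints for your cluster and adjust accordingly.

```python
import json

# Assumed prefix for Compute Engine resource URIs; verify it against the
# URI that gcloud prints with --log-http before relying on it.
COMPUTE_API_ROOT = "https://www.googleapis.com/compute/v1"


def full_image_uri(project_id, image_name):
    """Expand a short image name into a full image URI (hypothetical helper)."""
    return f"{COMPUTE_API_ROOT}/projects/{project_id}/global/images/{image_name}"


def build_cluster_body(project_id, cluster_name, image_name, image_project_id=None):
    """Build a clusters.create request body that uses a custom image.

    image_project_id lets the image live in a different project than the
    cluster; it defaults to the cluster's own project.
    """
    uri = full_image_uri(image_project_id or project_id, image_name)
    return {
        "projectId": project_id,
        "clusterName": cluster_name,
        "config": {
            # The custom image goes on the instance group configs,
            # not on softwareConfig.imageVersion.
            "masterConfig": {"imageUri": uri},
            "workerConfig": {"imageUri": uri},
        },
    }


if __name__ == "__main__":
    body = build_cluster_body("my-project", "cluster-name", "custom-image-name")
    print(json.dumps(body, indent=2))
```

The resulting dictionary can be serialized and sent as the body of the `projects.regions.clusters.create` request, or adapted to whatever cluster-creation hook your Airflow version provides.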

Dennis Huo