
I am trying to create a prototype where I can share resources among projects to run a job within the Google Cloud Platform.

Motivation: Let's say there are two projects, Project A and Project B. I want to use the Dataproc cluster created in Project A to run a job from Project B. The projects are within the same organisation on GCP.

How do I do that?


1 Answer


There are a few ways to manage resources across projects. Probably the most straightforward way to do this is to:

  1. Create a service account with appropriate permissions across your project(s).
  2. Set up an Airflow connection that uses the service account you have created.
  3. Create workflows that use that connection and specify the target project explicitly when you create a Cloud Dataproc cluster, as sketched below.
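
As a rough illustration of step 3, here is a minimal DAG sketch using the Google provider's Dataproc operators. The connection ID, project ID, cluster name, region, and GCS path are all hypothetical placeholders you would replace with your own values:

```python
from airflow import DAG
from airflow.providers.google.cloud.operators.dataproc import (
    DataprocCreateClusterOperator,
    DataprocSubmitJobOperator,
)
from airflow.utils.dates import days_ago

# Hypothetical names -- replace with your own.
CONN_ID = "google_cloud_cross_project"  # Airflow connection backed by the shared service account
PROJECT_A = "project-a"                 # project that owns the Dataproc cluster
REGION = "us-central1"
CLUSTER_NAME = "shared-cluster"

with DAG("cross_project_dataproc", start_date=days_ago(1), schedule_interval=None) as dag:
    create_cluster = DataprocCreateClusterOperator(
        task_id="create_cluster",
        project_id=PROJECT_A,           # cluster is created in Project A
        region=REGION,
        cluster_name=CLUSTER_NAME,
        cluster_config={"worker_config": {"num_instances": 2}},
        gcp_conn_id=CONN_ID,
    )

    submit_job = DataprocSubmitJobOperator(
        task_id="submit_job",
        project_id=PROJECT_A,           # jobs must target the cluster's own project
        region=REGION,
        job={
            "placement": {"cluster_name": CLUSTER_NAME},
            "pyspark_job": {"main_python_file_uri": "gs://my-bucket/job.py"},
        },
        gcp_conn_id=CONN_ID,
    )

    create_cluster >> submit_job
```

This DAG can live in an Airflow/Composer environment in Project B; the service account behind the connection just needs Dataproc permissions in Project A.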

Alternate ways you could do this that come to mind:

  1. Use something like the BashOperator or PythonOperator to execute Cloud SDK commands (see the sketch after this list).
  2. Use an HTTP operator to call the REST endpoints of the services you want to use.
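
For the Cloud SDK route, a minimal sketch would be a BashOperator running `gcloud` with an explicit `--project` flag; the cluster name, bucket, region, and project ID below are placeholders, and the worker's service account would need Dataproc permissions in that project:

```python
from airflow import DAG
from airflow.operators.bash import BashOperator
from airflow.utils.dates import days_ago

with DAG("cross_project_gcloud", start_date=days_ago(1), schedule_interval=None) as dag:
    # Submit a PySpark job to the cluster in Project A by passing --project explicitly.
    submit_via_gcloud = BashOperator(
        task_id="submit_via_gcloud",
        bash_command=(
            "gcloud dataproc jobs submit pyspark gs://my-bucket/job.py "
            "--cluster=shared-cluster --region=us-central1 --project=project-a"
        ),
    )
```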

Having said that, the first approach, a dedicated connection used by the Dataproc operators, is likely the easiest by far and would be the recommended way to do what you want.

With respect to Dataproc, when you create a job, it will only bind to clusters within a specific project. It's not possible to create jobs in one project against clusters in another. This is because things like logging, auditing, and other job-related semantics are messy when clusters live in another project.

  • Could I share a Container Registry among projects? In case my need is a CI/CD pipeline that I would deploy to many environments, should I have separate DEV, QA, and PROD projects, or should I put the DEV, QA, and PROD resources under one project? – Tiago Medici Oct 06 '20 at 13:41