7

How can I programmatically shut down a Google Dataproc cluster automatically after all jobs have completed?

Dataproc provides cluster creation, monitoring, and management, but I can't find a way to delete the cluster automatically once all jobs have completed.

– Sreenath Chothar

6 Answers

7

The gcloud dataproc CLI offers the --max-idle option, which automatically deletes the Dataproc cluster after a given period of inactivity (i.e. no running jobs). It can be used as follows:

gcloud dataproc clusters create test-cluster \
    --project my-test-project \
    --zone europe-west1-b \
    --master-machine-type n1-standard-4 \
    --master-boot-disk-size 100 \
    --num-workers 2 \
    --worker-machine-type n1-standard-4 \
    --worker-boot-disk-size 100 \
    --max-idle 1h
1

It depends on the language. Personally, I use Python (pyspark), and the code provided here worked fine for me:

https://github.com/GoogleCloudPlatform/python-docs-samples/blob/master/dataproc/submit_job_to_cluster.py

You may need to adapt the code to your purposes and follow the prerequisite steps in the README (https://github.com/GoogleCloudPlatform/python-docs-samples/tree/master/dataproc), such as enabling the API and installing the packages from requirements.txt.

Basically, the wait_for_job function blocks until the job has finished, and delete_cluster, as the name says, deletes the cluster you previously created. A minimal sketch of that pattern is below. I hope this helps.
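For reference, here is a minimal sketch of that wait-then-delete pattern, assuming the same googleapiclient-based approach as the linked sample (all project/region/cluster names below are placeholders):

import time

from googleapiclient import discovery

# Placeholder identifiers - substitute your own project/region/cluster values
project, region, cluster = 'my-project', 'us-central1', 'my-cluster'
dataproc = discovery.build('dataproc', 'v1')

def wait_for_job(job_id):
    # Poll the job until Dataproc reports a terminal state
    while True:
        result = dataproc.projects().regions().jobs().get(
            projectId=project, region=region, jobId=job_id).execute()
        state = result['status']['state']
        if state == 'ERROR':
            raise Exception(result['status'].get('details', 'Job failed'))
        if state == 'DONE':
            return result
        time.sleep(5)  # avoid hammering the API while polling

def delete_cluster():
    # Fires the delete request; Dataproc returns a long-running operation
    return dataproc.projects().regions().clusters().delete(
        projectId=project, region=region, clusterName=cluster).execute()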

1

To achieve this goal, you have three options:

  1. Set the --max-idle property during cluster creation (see https://stackoverflow.com/a/54239034/3227693).

  2. Use Dataproc Workflow Templates to manage the cluster lifecycle. A workflow template can automatically create a cluster to execute jobs on and delete the cluster after all jobs have finished (see the sketch after this list).

  3. Use a full-blown orchestration solution such as Cloud Composer to manage your cluster and job lifecycles.
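As an illustration of option 2, a rough gcloud sketch with hypothetical names (my-template, my-managed-cluster, package.MainClass, gs://my-bucket/my-job.jar); the managed cluster is created when the template is instantiated and deleted once all of its jobs finish:

gcloud dataproc workflow-templates create my-template --region europe-west1
gcloud dataproc workflow-templates set-managed-cluster my-template \
    --region europe-west1 \
    --cluster-name my-managed-cluster \
    --master-machine-type n1-standard-4 \
    --num-workers 2
gcloud dataproc workflow-templates add-job spark \
    --workflow-template my-template \
    --region europe-west1 \
    --step-id my-spark-step \
    --class package.MainClass \
    --jars gs://my-bucket/my-job.jar
gcloud dataproc workflow-templates instantiate my-template --region europe-west1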

– Igor Dvorzhak
0

There are a couple of programmatic ways to automatically shut down the cluster:

  1. Call the REST api directly
  2. Use the gcloud CLI

Either of these could be called after your job(s) finish executing; a minimal REST sketch follows below.

See more here: https://cloud.google.com/dataproc/docs/guides/manage-cluster#delete_a_cluster
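For example, a minimal sketch of the REST approach with placeholder project/region/cluster names (gcloud is used here only to obtain an access token):

curl -X DELETE \
  -H "Authorization: Bearer $(gcloud auth print-access-token)" \
  "https://dataproc.googleapis.com/v1/projects/my-project/regions/us-central1/clusters/my-cluster"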

– Graham Polley
  • I want to completely automate this task. So how will we get notified when a job finishes executing? Once we get a job-completion callback/notification, the REST API could be used to delete the cluster. – Sreenath Chothar May 09 '17 at 06:55
  • Again, use the REST API. Specifically, `GET` the job resource and wrap it in a polling loop - https://cloud.google.com/dataproc/docs/reference/rest/v1/projects.regions.jobs/get. So: submit > monitor > shutdown – Graham Polley May 09 '17 at 12:24
  • Ok. So the external script has to poll the job status and then, based on the status, fire different actions on the cluster. Are there any tools/third-party software that manage the Dataproc cluster with auto-shutdown and scaling capabilities? The same problem exists with auto-scaling. Dataflow handles auto-scaling by itself. – Sreenath Chothar May 11 '17 at 11:05
  • I don't know of any 3rd party tool. You'd need to hand-roll something yourself. – Graham Polley May 11 '17 at 12:08
  • In the same way, could we monitor the cluster health and scale up/down using the REST APIs? – Sreenath Chothar May 11 '17 at 12:15
  • Yes, use the API to change num workers - https://cloud.google.com/dataproc/docs/reference/rest/v1/projects.regions.clusters/patch – Graham Polley May 11 '17 at 12:19
0

You can do that with Scala code:

  • create the cluster
  • run all the jobs
  • when the jobs have terminated, delete the cluster

To do that you can work with Scala Futures.

If you have many jobs, you can run them in parallel:

// Assumes the Ammonite ops library (ammonite.ops) for the `%` shell-out syntax
import ammonite.ops._
import scala.concurrent.{Await, Future}
import scala.concurrent.ExecutionContext.Implicits.global
import scala.concurrent.duration._
import scala.util.Try

val gcpJarBucket = "gs://test_dataproc/dataproc/Dataproc.jar"
val jobs = Seq("package.class1", "package.class2")
val projectName: String = "automat-dataproc"
val clusterName: String = "your-cluster-name"

val timeout = 180.minutes

// Working directory for the `%` shell commands
implicit val wd = pwd

val future = Future {
  println("Creating the spark cluster...")
  % gcloud ("dataproc", "clusters", "create", clusterName, "--subnet", "default", "--zone", "europe-west1-b", "--master-machine-type", "n1-standard-4", "--master-boot-disk-size", "50", "--num-workers", "3", "--worker-machine-type", "n1-standard-4", "--worker-boot-disk-size", "50", "--project", projectName)
  println("Creating the spark cluster...DONE")
}.flatMap { _ =>
  {
    Future.sequence {
      jobs.map { jobClass =>
        Future {
          println(s"Launching the spark job from the class $jobClass...")
          % gcloud ("dataproc", "jobs", "submit", "spark", s"--cluster=$clusterName", s"--class=$jobClass", "--region=global", s"--jars=$gcpJarBucket")
          println(s"Launching the spark job from the class $jobClass...DONE")
        }
      }
    }

  }
}

// Wait for the job futures (bounded by the timeout), then delete the
// cluster whether the jobs succeeded or not
Try { Await.ready(future, timeout) }.recover { case exp => println(exp) }
% bash ("-c", s"printf 'Y\n' | gcloud dataproc clusters delete $clusterName")
– G.Saleh
0

You can delete the cluster when the Spark application finishes. Here are some examples:

// `profile` and `clusterName` are assumed to be parsed from the command-line args
private SparkApplication(String[] args) throws
                                        org.apache.commons.cli.ParseException,
                                        IOException,
                                        InterruptedException {

    // Your spark code here

    // Once the application's work is done, tear down the cluster,
    // but only when running with the GCP profile
    if (profile != null && profile.equals("gcp")) {
        DataProcUtil.deleteCluster(clusterName);
    }
}

And here is how you delete your cluster in Java, by shelling out to the gcloud CLI:

// Needed at the top of the file:
// import java.io.BufferedReader;
// import java.io.IOException;
// import java.io.InputStream;
// import java.io.InputStreamReader;
// `logger` is assumed to be an SLF4J (or similar) logger field.

public static void deleteCluster(String clusterName) throws IOException, InterruptedException {

    logger.info("Try to delete cluster: {}....", clusterName);

    // --async returns immediately without waiting for the deletion to finish;
    // --quiet suppresses the interactive confirmation prompt
    Process process = new ProcessBuilder("gcloud",
                                         "dataproc",
                                         "clusters",
                                         "delete",
                                         clusterName,
                                         "--async",
                                         "--quiet").start();

    int errCode = process.waitFor();
    boolean hasError = errCode != 0;
    logger.info("Command executed, any errors? {}", hasError);
    String output;
    if (hasError) {
        output = output(process.getErrorStream());
    }
    else {
        output = output(process.getInputStream());
    }

    logger.info("Output: {}", output);

}

private static String output(InputStream inputStream) throws IOException {
    StringBuilder sb = new StringBuilder();

    try (BufferedReader br = new BufferedReader(new InputStreamReader(inputStream))) {

        String line;
        while ((line = br.readLine()) != null) {

            sb.append(line)
              .append(System.getProperty("line.separator"));

        }
    }
    return sb.toString();

}
– howie