
When doing spark-submit, gcloud provides a --properties-file option to pass cluster properties and Spark configurations. I am not sure how to use it when running the job.


1 Answer


Create a .txt file with any name. The file contains one property per line, in key=value form, as shown below.

spark.hadoop.hive.metastore.uris=ip1,ip2,ip3
spark.submit.deployMode=cluster
spark.yarn.appMasterEnv.PYTHONPATH=some_path
spark.executorEnv.PYTHONPATH=some_path
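Spark reads this file as a standard properties file: blank lines and lines starting with # are ignored, and each remaining line is split on the first = into a key and a value. A rough Python sketch of that parsing, for illustration only (Spark actually uses java.util.Properties, which also accepts : and whitespace separators; this covers the common key=value form):

```python
def parse_properties(text):
    """Parse a Spark-style key=value properties file into a dict.

    Rough sketch: skips blank lines and comments, then splits
    each remaining line on the first '='.
    """
    props = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blanks and comments
        key, _, value = line.partition("=")
        props[key.strip()] = value.strip()
    return props


example = """\
# sample properties file
spark.submit.deployMode=cluster
spark.executorEnv.PYTHONPATH=some_path
"""
print(parse_properties(example)["spark.submit.deployMode"])  # cluster
```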

In your spark-submit command, pass this .txt file as follows:

gcloud dataproc jobs submit pyspark main.py --cluster=<cluster_name> --region=<region_name> --properties-file=<path to .txt file>