I need to add a config file to the Spark driver classpath on Google Dataproc. I have tried to use the --files option of gcloud dataproc jobs submit spark, but this does not work. Is there a way to do this on Google Dataproc?
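For reference, the command I tried was roughly this (bucket, cluster, and file names here are placeholders):

gcloud dataproc jobs submit spark \
    --cluster=my-cluster \
    --region=us-central1 \
    --class com.example.MyJob \
    --files gs://my-bucket/app.properties \
    --jars gs://my-bucket/my-job.jar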
In Dataproc, anything listed with --jars will be added to the classpath, and anything listed with --files will be made available in each Spark executor's working directory. Even though the flag is --jars, it should be safe to put non-jar entries in the list if you need the file to be on the classpath.
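As a minimal sketch of that approach, assuming a hypothetical bucket gs://my-bucket, job jar my-job.jar, and config file app.properties (none of these names are from the question):

gcloud dataproc jobs submit spark \
    --cluster=my-cluster \
    --region=us-central1 \
    --class com.example.MyJob \
    --jars gs://my-bucket/my-job.jar,gs://my-bucket/app.properties

Inside the job, the file can then be looked up as a classpath resource, with a fallback to the working directory the file is localized to:

import java.io.FileInputStream
import java.util.Properties

object MyJob {
  def main(args: Array[String]): Unit = {
    val props = new Properties()
    // Try the classpath first (where --jars entries end up per the answer
    // above), then the working directory the file is localized to.
    val in = Option(getClass.getClassLoader.getResourceAsStream("app.properties"))
      .getOrElse(new FileInputStream("app.properties"))
    try props.load(in) finally in.close()
    println(props.getProperty("some.key"))
  }
}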
I know I am answering late; posting for new visitors. You can run this from Cloud Shell; I have tested it:
gcloud dataproc jobs submit spark \
    --properties spark.dynamicAllocation.enabled=false \
    --cluster=<cluster_name> \
    --class com.test.PropertiesFileAccess \
    --region=<CLUSTER_REGION> \
    --files gs://<BUCKET>/prod.predleads.properties \
    --jars gs://<BUCKET>/snowflake-common-3.1.34.jar
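As a sketch of how the job might then read a file shipped with --files: SparkFiles.get resolves the local path of the distributed copy, so the properties can be loaded like this (the file name matches the command above; the surrounding code is illustrative):

import java.io.FileInputStream
import java.util.Properties
import org.apache.spark.SparkFiles

// SparkFiles.get returns the absolute local path of a file that was
// distributed to this node via --files.
val props = new Properties()
val in = new FileInputStream(SparkFiles.get("prod.predleads.properties"))
try props.load(in) finally in.close()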