
Actually the document explicitly states:

When applying a property to a job, the file prefix is not used.

However, the example given there is inconsistent with this statement.

This is what the page says:

...However, many of these properties can also be applied to specific jobs. When applying a property to a job, the file prefix is not used. The following example sets Spark executor memory to 4g for a Spark job (spark: prefix omitted).

gcloud dataproc jobs submit spark \
    --region=region \
    --properties=spark.executor.memory=4g \
    ... other args ...

Job properties can be submitted in a file using the gcloud dataproc jobs submit job-type --properties-file flag (see, for example, the --properties-file description for a Hadoop job submission).

gcloud dataproc jobs submit JOB_TYPE \
    --region=region \
    --properties-file=PROPERTIES_FILE \
    ... other args ...

The PROPERTIES_FILE is a set of line-delimited key=value pairs. The property to be set is the key, and the value to set the property to is the value. See the java.util.Properties class for a detailed description of the properties file format.

The following is an example of a properties file that can be passed to the --properties-file flag when submitting a Dataproc job.

dataproc:conda.env.config.uri=gs://some-bucket/environment.yaml
spark:spark.history.fs.logDirectory=gs://some-bucket
spark:spark.eventLog.dir=gs://some-bucket
capacity-scheduler:yarn.scheduler.capacity.root.adhoc.capacity=5

In the example above, file prefixes are used in the job properties, which contradicts the statement that the prefix is not used when applying a property to a job.

figs_and_nuts
2 Answers


No, property prefixes are only used for cluster properties and do not apply to job properties: https://cloud.google.com/dataproc/docs/concepts/configuring-clusters/cluster-properties
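To illustrate, here is a minimal sketch of the difference (CLUSTER_NAME and region are placeholders, and the memory setting is reused from the example in the question): the cluster-level property keeps the file prefix, while the equivalent job-level property drops it.

# Cluster property: the file prefix ("spark:") is required
gcloud dataproc clusters create CLUSTER_NAME \
    --region=region \
    --properties=spark:spark.executor.memory=4g

# Job property: no file prefix
gcloud dataproc jobs submit spark \
    --region=region \
    --properties=spark.executor.memory=4g \
    ... other args ...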

Igor Dvorzhak
  • Yes, I have been through the link you provided. Can you tell me how the last example given on the same page makes sense in that case? Specifically: https://cloud.google.com/dataproc/docs/concepts/configuring-clusters/cluster-properties#cluster_vs_job_properties:~:text=The%20following%20is%20an%20example%20of%20a%20properties%20file%20that%20can%20be%20passed%20to%20the%20%2D%2Dproperties%2Dfile%20flag%20when%20submitting%20a%20Dataproc%20job. – figs_and_nuts Jun 21 '23 at 07:24
  • @figs_and_nuts this must be a bug in the doc, please report it via "Report error" functionality (button in the bottom right corner) on this doc page. – Igor Dvorzhak Jun 21 '23 at 15:48

As mentioned by @Igor, file prefixes are not used when applying a property to a job. They are only required at the cluster level, when creating a cluster.

For the second Dataproc job command mentioned, i.e.

gcloud dataproc jobs submit JOB_TYPE \
    --region=region \
    --properties-file=PROPERTIES_FILE \
    ... other args ...

this documentation describes the PROPERTIES_FILE: properties are specified as property=value pairs in a text file, where property is the key and value is the value to set for that property, i.e. no file prefix is required (see the sketch below).
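As a sketch, the properties file from the question would presumably look like this when used for a job, with the prefixes dropped (the bucket name is reused from the question; the dataproc: and capacity-scheduler: entries appear to be cluster-level settings and are left out here as an assumption):

# job properties file: plain key=value pairs, no file prefix
spark.history.fs.logDirectory=gs://some-bucket
spark.eventLog.dir=gs://some-bucket

It would then be passed via --properties-file as shown in the command above.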

Sakshi Gatyan