While it's possible to provide your own generated jobid when using the underlying REST API, there isn't currently any way to specify your own jobid when submitting with gcloud dataproc jobs submit
; this feature might be added in the future. That said, typically when people want to specify job ids they also want to be able to list with more complex match expressions, or potentially to have multiple categories of jobs listed by different kinds of expressions at different points in time.
So, you might want to consider dataproc labels instead; labels are intended specifically for this kind of use case, and are optimized for efficient lookup. For example:
gcloud dataproc jobs submit pig --labels jobtype=mylogspipeline,date=20170508 ...
gcloud dataproc jobs submit pig --labels jobtype=mylogspipeline,date=20170509 ...
gcloud dataproc jobs submit pig --labels jobtype=mlpipeline,date=20170509 ...
gcloud dataproc jobs list --filter "labels.jobtype=mylogspipeline"
gcloud dataproc jobs list --filter "labels.date=20170509"
gcloud dataproc jobs list --filter "labels.date=20170509 AND labels.jobtype=mlpipeline"