I have submitted a Hive job to a Dataproc cluster using Airflow's DataprocWorkflowTemplateInstantiateInlineOperator.
When one of the jobs fails, under Google Cloud -> Dataproc -> Jobs I can see a link to the log with the failure:
Google Cloud Dataproc Agent reports job failure. If logs are available, they can be found in 'gs://dataproc-abcde12-efghi23-jklmn12-uk/google-cloud-dataproc-metainfo/12354a681fgh161/jobs/job1-abdc12jssa/driveroutput'
Can I fetch this log link (e.g. gs://dataproc-abcde12-efghi23-jklmn12-uk/google-cloud-dataproc-metainfo/12354a681fgh161/jobs/job1-abdc12jssa/driveroutput) through Airflow?
I checked gcp_dataproc_hook.py and the operator code for anything that points to a log link so that I could retrieve it, but didn't find anything useful.
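To make it concrete, this is roughly what I am hoping to achieve. The sketch below uses the google-cloud-dataproc and google-cloud-storage client libraries directly rather than anything the operator exposes, and the project, region and job id values are placeholders I made up; I would prefer a way to get the same information from within Airflow itself:

```python
# Sketch (assumptions: placeholder project/region/job id; google-cloud-dataproc >= 2.x):
# look up the finished Dataproc job, read its driver output URI, then fetch the log from GCS.
from google.cloud import dataproc_v1, storage

PROJECT_ID = "my-project"      # placeholder
REGION = "europe-west2"        # placeholder
JOB_ID = "job1-abdc12jssa"     # Dataproc job id as shown in the console

# Dataproc job lookups need the regional endpoint.
job_client = dataproc_v1.JobControllerClient(
    client_options={"api_endpoint": f"{REGION}-dataproc.googleapis.com:443"}
)
job = job_client.get_job(project_id=PROJECT_ID, region=REGION, job_id=JOB_ID)

# The Job resource exposes the gs:// prefix of the driver output.
driver_uri = job.driver_output_resource_uri
print("Driver output prefix:", driver_uri)

# Driver output is written as one or more objects under that prefix,
# so list them and print their contents.
bucket_name, _, prefix = driver_uri[len("gs://"):].partition("/")
gcs = storage.Client(project=PROJECT_ID)
for blob in gcs.list_blobs(bucket_name, prefix=prefix):
    print(blob.download_as_bytes().decode("utf-8", errors="replace"))
```

Is there an equivalent way to get at driver_output_resource_uri (or the gs:// link from the failure message) through the Airflow hook or operator?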