
We are migrating two Spark Structured Streaming jobs from on-prem to GCP.

One of them streams messages from Kafka and saves them to GCS; the other streams from GCS and saves to BigQuery.

Sometimes these jobs fail for various reasons, for example: OutOfMemoryError, Connection reset by peer, Java heap space, etc.

When we get an exception in the on-prem environment, YARN marks the job as FAILED, and we have a scheduler flow that restarts the job.

In GCP, we built the same flow to restart the job when it fails. But when we get an exception in Dataproc, YARN marks the job as SUCCEEDED, and Dataproc keeps the job status as RUNNING.
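For context, this is roughly how we submit the job to Dataproc (the cluster name, region, class, and jar path below are placeholders, not our real values); our restart flow re-runs this command when the job is reported as failed:

```shell
# Sketch of the Dataproc job submission; all names are placeholders.
gcloud dataproc jobs submit spark \
  --cluster=streaming-cluster \
  --region=us-central1 \
  --class=com.example.KafkaToGcsJob \
  --jars=gs://our-bucket/jars/streaming-job.jar
```

The restart flow depends on Dataproc reporting a non-success job state after the driver throws, which is exactly what is not happening here.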

You can see in this image the log with the StreamingQueryException while the job status is still Running ("Em execução" means "running" in Portuguese).

Dataproc job
