I have a Java program in which I use Spark as the runner for an Apache Beam pipeline.
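To make the setup concrete, here is a simplified sketch of how the pipeline is created and run on the Spark runner. The class name and transforms are placeholders, not my real code; the actual pipeline just collects and processes some data.

```java
import org.apache.beam.runners.spark.SparkPipelineOptions;
import org.apache.beam.runners.spark.SparkRunner;
import org.apache.beam.sdk.Pipeline;
import org.apache.beam.sdk.options.PipelineOptionsFactory;
import org.apache.beam.sdk.transforms.Create;

public class BeamOnSpark {
  public static void main(String[] args) {
    // Parse the command-line options and force the Spark runner.
    SparkPipelineOptions options =
        PipelineOptionsFactory.fromArgs(args).withValidation().as(SparkPipelineOptions.class);
    options.setRunner(SparkRunner.class);

    Pipeline pipeline = Pipeline.create(options);

    // Placeholder transform standing in for the step that collects the data.
    pipeline.apply("ReadInput", Create.of("record-1", "record-2", "record-3"));
    // ... the real transforms that collect and process the data go here ...

    pipeline.run().waitUntilFinish();
  }
}
```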
One of the Spark tasks collects some data. The task finished correctly, but afterwards its worker died and the task was reassigned to another worker.
Why doesn't Spark recover the data that was already collected? What is the best way to recover it?
I tried enabling Spark's external shuffle service, but I still ran into the same problem (a sketch of the configuration I used is shown below). Any advice is welcome.
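This is roughly how I enabled the shuffle service. It is a simplified sketch assuming the Spark settings are passed through a provided SparkContext; the class and app names are placeholders, and in my real setup the master and some cluster settings come from spark-submit.

```java
import org.apache.beam.runners.spark.SparkContextOptions;
import org.apache.beam.runners.spark.SparkRunner;
import org.apache.beam.sdk.Pipeline;
import org.apache.beam.sdk.options.PipelineOptionsFactory;
import org.apache.spark.SparkConf;
import org.apache.spark.api.java.JavaSparkContext;

public class BeamOnSparkWithShuffleService {
  public static void main(String[] args) {
    // Ask executors to register with the external shuffle service so that
    // shuffle files can outlive a dying executor. The shuffle service daemon
    // must also be running on each worker node; the master URL and other
    // cluster settings are expected to come from spark-submit.
    SparkConf conf = new SparkConf()
        .setAppName("beam-on-spark") // placeholder app name
        .set("spark.shuffle.service.enabled", "true");

    JavaSparkContext jsc = new JavaSparkContext(conf);

    // Hand the pre-configured Spark context to the Beam Spark runner.
    SparkContextOptions options =
        PipelineOptionsFactory.fromArgs(args).as(SparkContextOptions.class);
    options.setRunner(SparkRunner.class);
    options.setUsesProvidedSparkContext(true);
    options.setProvidedSparkContext(jsc);

    Pipeline pipeline = Pipeline.create(options);
    // ... same transforms as in the sketch above ...
    pipeline.run().waitUntilFinish();
  }
}
```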