Sometimes a continuously running DLT pipeline job will lose connection with the driver when it autoscales (enhanced autoscaling is enabled). The exact error message is:

INTERNAL_ERROR: Communication lost with driver. Cluster xyz was not reachable for 120 seconds

Is it possible to adjust the timeout duration or fix it in some other way?

Alex Ott
kyrre
  • I suggest opening a support ticket for that. Without logs, code, etc., it's hard to say what the reason could be – Alex Ott Aug 04 '23 at 11:26
  • Yes, I agree with Alex. With any kind of internal error, it's hard to see what was going on in the backend. But if I had to offer a theory: if this only happens when you are upscaling, it could mean that additional clusters are not being assigned because there are no clusters in the stand-by pool (some clusters are always kept running, waiting to be assigned to a client's driver). – Ziya Mert Karakas Aug 05 '23 at 14:57

0 Answers