I am having trouble running a Databricks notebook (Scala). The job has a high shuffle write size and has already been running for over an hour. Here is the relevant screen:

[screenshot: Spark UI stage details]

Any idea how to check why this is happening?

The stage shows shuffle write: 35.5GB / 1796240509. What is the meaning of 35.5GB and 1796240509?
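For context, shuffle write appears on any stage that ends in a wide transformation (groupBy, join, repartition, etc.), where each executor serializes its partial output to local disk before the next stage fetches it over the network. A minimal hypothetical Scala sketch of the kind of operation that produces this metric (the input path, column name, and output path are illustrative assumptions, not taken from the notebook):

    // Hypothetical job shape: paths and column names are illustrative only.
    val df = spark.read.parquet("/mnt/data/events")  // assumed input

    // groupBy is a wide transformation: every executor serializes its partial
    // counts to local disk (counted as "shuffle write") before the reduce
    // stage fetches them over the network.
    val counts = df.groupBy("userId").count()

    counts.write.mode("overwrite").parquet("/mnt/data/counts")  // assumed output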

mytabi
  • 1796240509 is the number of records, whereas shuffle write is the total size of the serialized data written on all executors before it is transmitted to the next stage. https://stackoverflow.com/questions/27276884/what-is-shuffle-read-shuffle-write-in-apache-spark – anshul_cached Aug 20 '19 at 07:48
  • The number of what records? – mytabi Aug 20 '19 at 08:53
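To confirm what the comment above describes, a listener can print both halves of that metric for every completed stage. This is a minimal sketch, assuming a Databricks Scala notebook where spark is the predefined SparkSession; bytesWritten corresponds to the 35.5GB figure and recordsWritten to the 1796240509 figure:

    import org.apache.spark.scheduler.{SparkListener, SparkListenerStageCompleted}

    // Log the two shuffle-write metrics for every completed stage.
    spark.sparkContext.addSparkListener(new SparkListener {
      override def onStageCompleted(stage: SparkListenerStageCompleted): Unit = {
        val metrics = stage.stageInfo.taskMetrics
        if (metrics != null) {
          val sw = metrics.shuffleWriteMetrics
          val gb = sw.bytesWritten / math.pow(1024, 3)  // bytes -> GiB, the "35.5GB" part
          // recordsWritten is the row count, the "1796240509" part
          println(f"Stage ${stage.stageInfo.stageId}: shuffle write = $gb%.1f GB / ${sw.recordsWritten} records")
        }
      }
    })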

0 Answers