
Versions: Scala 2.11, Spark 2.4.4

To capture the records-written count, I created my own implementation of SparkListener and registered it while creating the Spark session.

import org.apache.spark.scheduler.{SparkListener, SparkListenerTaskEnd}

class SparkMetricListener extends SparkListener {
  override def onTaskEnd(taskEnd: SparkListenerTaskEnd): Unit = {
    // use taskEnd.taskMetrics.outputMetrics.recordsWritten to get the records-written count
    val recordsWritten = taskEnd.taskMetrics.outputMetrics.recordsWritten
    // ... aggregate/log recordsWritten here
  }
}
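
For reference, a minimal sketch of how such a listener might be registered when the session is created (the app name is a placeholder; registration via sparkContext.addSparkListener is one option, setting spark.extraListeners to the class name is another):

import org.apache.spark.sql.SparkSession

val spark = SparkSession.builder()
  .appName("metrics-demo") // placeholder app name
  .getOrCreate()

// attach the custom listener so onTaskEnd fires for every task
spark.sparkContext.addSparkListener(new SparkMetricListener())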

This works fine when my target (dataframe.write) is either Hive or Parquet, and I am able to get the desired metrics/record count.

The problem is when I try to use this onTaskEnd listener metric for the Spark JDBC writer (df.write.format("jdbc")): it always returns ZERO as the records-written count.
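
For context, the JDBC write is along these lines (a sketch only; the URL, table, and credentials are placeholders):

// df is the DataFrame whose written-record count I want to observe
df.write
  .format("jdbc")
  .option("url", "jdbc:postgresql://host:5432/db") // placeholder connection URL
  .option("dbtable", "target_table")               // placeholder target table
  .option("user", "user")
  .option("password", "password")
  .mode("append")
  .save()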

Can anyone please advise whether there is another listener implementation we can use to get the target record count for JDBC writes?

