I am using Apache Spark to export data from SQL Server to CSV, with the following version details:
- implementation 'com.microsoft.azure:spark-mssql-connector_2.12:1.2.0'
- implementation 'org.apache.spark:spark-core_2.12:3.1.3'
- implementation 'org.apache.spark:spark-sql_2.12:3.1.3'
Each table's export to CSV is further split into multiple tasks via the standard JDBC partitioning options (see the sketch after this list):
- "lowerBound"
- "upperBound"
- "numPartitions"
- "partitionColumn"
So if numPartitions is 5, there will be 5 tasks under one job.
What I am looking for help with:
On each task's completion, I need to perform some task-specific operations (using task-specific data). Is there a way to hook a listener onto each task or job?
I know a listener can be registered by extending SparkListener,
but that is attached to the whole SparkContext, so it cannot carry out my task-specific operations (see the sketch below).