I am trying to profile some Spark jobs and want to collect Java Flight Recorder (JFR) files from each executor. I am running my jobs on a YARN cluster with several nodes, so I cannot manually collect the JFR files for each run. I want to write a script that collects the JFR files from every node in the cluster for a given job.
MapReduce (MR) provides a way to name the JFR file generated by each task after its task ID: it replaces '@taskid@' in the Java opts with the task ID. With this I get a unique name for the JFR file created by each task, and since the task ID also contains the job ID, I can parse it to distinguish files generated by different MR jobs.
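To make the MR side concrete, here is a rough sketch of how I recover the job ID from a JFR file name on the collection side. It assumes the file name contains a Hadoop task ID or task attempt ID in the usual `task_..._m_...` or `attempt_..._m_..._N` form; the function name `job_id_from_jfr_name` and the example file names are only illustrative, not part of any Hadoop API.

```python
import re
from typing import Optional

# Matches Hadoop task IDs (task_<cluster>_<job>_<m|r>_<task>) and
# task attempt IDs (attempt_<cluster>_<job>_<m|r>_<task>_<attempt>).
_TASK_ID = re.compile(r"(?:task|attempt)_(\d+)_(\d+)_[mr]_\d+(?:_\d+)?")


def job_id_from_jfr_name(filename: str) -> Optional[str]:
    """Return the job ID embedded in a JFR file name, or None if absent.

    Hypothetical helper: assumes the JFR file was named after the task ID.
    """
    match = _TASK_ID.search(filename)
    if match is None:
        return None
    # The job ID reuses the cluster timestamp and job sequence number.
    return f"job_{match.group(1)}_{match.group(2)}"


if __name__ == "__main__":
    # Example file name is made up for illustration.
    print(job_id_from_jfr_name("attempt_1408272347123_0001_m_000003_0.jfr"))
```

Grouping the collected files by the returned job ID is what lets me separate the recordings of different MR jobs, and I would like to do the same per application and executor in Spark.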
I am wondering whether Spark has something similar. Does Spark provide a way to reference the executor ID in the Java opts? Has anyone else tried something similar and found a better way to collect all the JFR files for a Spark job?