
I was trying to profile some Spark jobs and I want to collect Java Flight Recorder (JFR) files from each executor. I am running my job on a YARN cluster with several nodes, so I cannot manually collect the JFR files for each run. I want to write a script that can collect the JFR files from every node in the cluster for a given job.

MR provides a way to name the JFR files generated by each task with the task ID: it replaces '@task@' in the Java opts with the TaskId. With this I get a unique name for the JFR files created by each task, and since the TaskId also contains the JobId, I can parse it to distinguish files generated by different MR jobs.
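
For reference, the MapReduce setup looks roughly like this; the property name and JFR flags are illustrative and depend on your Hadoop and JDK versions:

    # Rough sketch of the MR setup described above; property name and JFR flags
    # are illustrative (Oracle JDK 8 also needs -XX:+UnlockCommercialFeatures).
    # '@task@' is substituted with the task ID, so each task writes its own file.
    mapreduce.map.java.opts=-XX:+FlightRecorder -XX:StartFlightRecording=duration=60s,filename=/tmp/@task@.jfr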

I am wondering if Spark has something similar. Does Spark provide a way to determine the executor ID in the Java opts? Has anyone else tried to do something similar and found a better way to collect all the JFR files for a Spark job?

2 Answers


You can't set an executor ID in the opts, but you can get each executor's ID from the event log, along with the node it ran on.
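
For example, with event logging enabled, every SparkListenerExecutorAdded event in the log carries the executor ID and the host it was started on. A minimal sketch, where the event log directory and the application ID are placeholders:

    # Assumes spark.eventLog.enabled=true and an HDFS event log directory;
    # /spark-events and the application ID below are placeholders.
    # Each SparkListenerExecutorAdded event is a JSON line containing the
    # "Executor ID" and the executor's "Host".
    hdfs dfs -cat /spark-events/application_1428_0001 | grep SparkListenerExecutorAdded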

However, I believe the options you give to spark-submit for a YARN master and for a standalone one have the same effect on the executors' JVMs, so you should be fine!

Bacon

You can use the {{EXECUTOR_ID}} and {{APP_ID}} placeholders in the spark.executor.extraJavaOptions parameter. Spark will replace them with the executor's ID and the application's ID, respectively.
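
A minimal sketch of what that can look like on spark-submit; the JFR flags, the /tmp output path, and the jar/class names are assumptions (Oracle JDK 8 also needs -XX:+UnlockCommercialFeatures, newer JDKs do not):

    # {{APP_ID}} and {{EXECUTOR_ID}} are the placeholders described above;
    # the JFR flags, output path, and jar/class names are assumptions.
    spark-submit \
      --master yarn \
      --conf "spark.executor.extraJavaOptions=-XX:+FlightRecorder -XX:StartFlightRecording=duration=120s,filename=/tmp/{{APP_ID}}-{{EXECUTOR_ID}}.jfr" \
      --class com.example.MyJob my-job.jar

Since each file name then embeds the application ID, a collection script only needs to pull /tmp/<appId>-*.jfr from every node once the job finishes.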

rlyzwa