
I want to measure Flink's performance with performance counters (perf). My code:

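// Imports needed outside the Flink shell
import org.apache.flink.api.scala._
import org.apache.flink.core.fs.FileSystem.WriteMode

// Word count: split each line into lowercase words, then sum the counts per word
val text = env.readTextFile("<filename>")
val counts = text.flatMap { _.toLowerCase.split("\\W+") }.map { (_, 1) }.groupBy(0).sum(1)
counts.writeAsText("<filename_result>", WriteMode.OVERWRITE)
env.execute()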

I know the PID of the JobManager. During execution I can also see the TID of the thread (CHAIN DataSource) that runs the execute() command. But the TID changes with every execution, so I can't rely on it. Is there a way to find the PID of the JobManager's child process that runs the execute() command? Are there separate child processes for each transformation (e.g. flatMap) of the DataSet? If so, is it possible to find out their distinct PIDs?

Matthias J. Sax
lary

1 Answer


The individual operators are not executed in distinct processes. The JobManager and the TaskManagers are each started as a Java process. A TaskManager then runs a set of parallel tasks (corresponding to the operators), and each parallel task is executed in its own thread. When you start Flink, the system creates the files /tmp/your-name-taskmanager.pid and /tmp/your-name-jobmanager.pid, which contain the PIDs of these processes.
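Since the tasks are threads inside the TaskManager process, one way to see them is to read the PID from the PID file and list that process's threads. The following is only a minimal sketch, assuming Linux with /proc mounted; `list_tids` is a hypothetical helper name, not part of Flink, and the PID file path is the placeholder used above.

```shell
#!/bin/sh
# Hypothetical helper (not part of Flink): print the thread IDs (TIDs) of the
# process whose PID is stored in the given PID file, one per line, followed by
# the thread name the kernel reports (if it can be read).
# Assumption: Linux with /proc mounted.
list_tids() {
    pid_file="$1"
    pid="$(cat "$pid_file")" || return 1
    # Each directory under /proc/<pid>/task corresponds to one thread.
    for task in /proc/"$pid"/task/*; do
        [ -d "$task" ] || continue
        printf '%s %s\n' "${task##*/}" "$(cat "$task/comm" 2>/dev/null)"
    done
}

# Example usage (adjust the placeholder path to your setup):
#   list_tids /tmp/your-name-taskmanager.pid
#   perf stat -p "$(cat /tmp/your-name-taskmanager.pid)"
```

Attaching perf to the TaskManager's PID rather than to a TID avoids the problem that the TID changes between executions, at the cost of measuring the process as a whole rather than a single task.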

Till Rohrmann
  • In the tmp directory only a /tmp/my-name-jobmanager.pid exists and it contains only the PID of the JobManager. How I start Flink: first I start a cluster with /bin/start-cluster.sh (or /bin/start-local.sh) and then I connect a Flink Shell to it. Am I doing something wrong? – lary Oct 30 '15 at 17:06
  • If your `TaskManager` is running on a different machine, then the PID file will be stored on that machine. If you start a local cluster, e.g. using `start-local.sh`, then Flink will only start a single process in which the `JobManager` and a single `TaskManager` run. If you start the Flink shell without having explicitly started a cluster, then it will start a local cluster for you. – Till Rohrmann Oct 30 '15 at 17:23
  • I start Flink on only one machine, even with start-cluster.sh. So the jobmanager and the taskmanager should be on the same machine, right? – lary Oct 30 '15 at 17:26
  • Yes, you can verify that everything worked by calling `jps`. There you should see a `JobManager` and a `TaskManager` process. If not, then you should look in the `logs/your-hostname-taskmanager.log` file to see what went wrong. – Till Rohrmann Oct 30 '15 at 17:31
  • Ok, now it worked, the taskmanager file in the tmp directory exists. Thanks – lary Oct 30 '15 at 17:59