0

I'm using Giraph 1.3 built with the YARN profile. To start, I configured 1 namenode and 2 datanodes on an EC2 cluster. My application works correctly: I see the expected output in the logs and in the output directory. I launched Giraph with the "-w 2" argument because I have two datanodes.

In the userlogs of datanode1 I found the log of the first worker.
In the userlogs of datanode2 I found the log of the second worker and the log of the master too.

I expected to find the master's log on the namenode, i.e. I expected the master to run on the namenode. Is that right?

Or do I have to configure another datanode, so that the master's log ends up on that new datanode?

OneCricketeer

2 Answers

0

I found out that Hadoop/Giraph works by creating containers on the datanodes. Hadoop creates a container for the application master, then Giraph creates a container for its master. Giraph also creates one container per worker, as set by the -w parameter.
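
As a rough illustration of the breakdown above, here is a minimal sketch that counts the containers for a given -w value. The function name and the dictionary keys are made up for this example; they are not part of Giraph or YARN, and the count simply follows the description in this answer.

```python
def expected_containers(workers: int) -> dict:
    """Count containers as described above (hypothetical helper, not a Giraph API)."""
    if workers < 1:
        raise ValueError("workers must be at least 1")
    return {
        "yarn_application_master": 1,  # created by Hadoop/YARN for the job
        "giraph_master": 1,            # created by Giraph
        "giraph_workers": workers,     # one per worker requested with -w
        "total": 2 + workers,
    }


# With "-w 2" this gives 4 containers spread over the 2 datanodes,
# so at least one datanode has to host more than one container.
print(expected_containers(2))
```

That would fit with what the question reports: the master's log sits next to a worker's log on one of the datanodes.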

0

YARN always creates an Application Master for every job.

You can start as many "workers" as your workload needs, but since you only have 2 datanodes, you have at most 2 NodeManagers, and that caps your parallelism.

A NodeManager has a maximum amount of memory available to it, and each YARN container running a task of a job gets a portion of that memory for its processing.
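
To make the memory point concrete, here is a small sketch of the arithmetic. The function name and the figures (8192 MB per NodeManager, 2048 MB per container) are assumptions chosen for illustration, not values from this cluster. It only looks at memory and ignores vcores and any rounding the scheduler applies.

```python
def max_containers_per_node(node_memory_mb: int, container_memory_mb: int) -> int:
    """Upper bound on containers one NodeManager can host, by memory alone."""
    if node_memory_mb <= 0 or container_memory_mb <= 0:
        raise ValueError("memory sizes must be positive")
    # Each container takes a fixed slice of the NodeManager's memory.
    return node_memory_mb // container_memory_mb


# Hypothetical figures: 8192 MB per NodeManager, 2048 MB per container.
per_node = max_containers_per_node(8192, 2048)
print(per_node)      # containers that fit on one NodeManager
print(per_node * 2)  # across the 2 NodeManagers in this setup
```

The memory a NodeManager offers is set by yarn.nodemanager.resource.memory-mb in yarn-site.xml.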

OneCricketeer