0

I am confused about the relationship between core instances and the mappers each instance can run. How are these mappers created? If I set the core instance count to 0, so that only the master node is running, why can MapReduce jobs still run without any task nodes?

Thanks in advance.

Undo
user2764080

2 Answers

1

The number of cores refers to how many processors each machine in a given cluster has. Moreover, each core can run a mapper.

You don't have to worry about creating the mappers, because the Hadoop framework does that for you.

Dhoha
0

That's a really good question. My guess is that EMR is smart enough to set up the Master node to run the MapReduce jobs when there are no Core or Task nodes. That's only a guess.

If you want to find out whether I'm right, spin up a cluster, start a MapReduce job, and keep an eye on the Java processes with jps -lm to see whether any mapper processes get launched on the Master node.
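If watching jps output by eye is tedious, something like the following might help. It is only a rough sketch: it assumes map and reduce task JVMs show up under the main class org.apache.hadoop.mapred.YarnChild (Hadoop 2 and later, on YARN) or org.apache.hadoop.mapred.Child (Hadoop 1), and that jps is on the PATH of the Master node. The function names are made up for illustration.

```python
import subprocess

# Main classes that MapReduce task JVMs are assumed to run under:
# YarnChild on Hadoop 2+ (YARN), Child on Hadoop 1. Adjust for your version.
TASK_CLASSES = (
    "org.apache.hadoop.mapred.YarnChild",
    "org.apache.hadoop.mapred.Child",
)


def find_task_jvms(jps_output):
    """Return (pid, main_class) pairs for `jps -lm` lines that look like task JVMs."""
    found = []
    for line in jps_output.splitlines():
        parts = line.split()
        # `jps -lm` prints: <pid> <fully qualified main class> <args...>
        if len(parts) >= 2 and parts[1] in TASK_CLASSES:
            found.append((parts[0], parts[1]))
    return found


def poll_once():
    """Run `jps -lm` on this machine and return any task JVMs currently visible."""
    result = subprocess.run(
        ["jps", "-lm"], capture_output=True, text=True, check=True
    )
    return find_task_jvms(result.stdout)
```

Calling poll_once() in a loop on the Master node while the job runs should return a non-empty list if tasks really are being launched there. Note that it cannot tell mappers from reducers, since both would run under the same main class.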

nelsonda