I am trying to learn YARN, but I have hit a roadblock and have a few questions.

  1. For every application, each data node must have a container. Are these containers created automatically when an application runs, or do we need to create them ourselves?

  2. In Docker, you can access the containers, which behave like separate machines. Can we access YARN containers in a similar way?

  3. In Docker, a container cannot see outside itself and behaves as a system of its own, so a process has one process ID inside the container and another on the host machine. In other words, containers are isolated from other processes. Is there a similar concept in YARN?

Thanks in Advance!!! :)

RV186

1 Answer

YARN is not what you think it is. It is not a tool for launching Docker containers. YARN is for launching distributed applications (Spark, MapReduce, etc.).

  1. You can't "install" an app in YARN; you can only "run" one. YARN allocates the containers for the app when it runs, so you don't create them yourself.

  2. A container is a YARN abstraction: it means that each process of a distributed application runs with a limited set of resources assigned to it by YARN. You can't access a container the way you would a Docker container, since it's just a Java process.

  3. As I've mentioned before, a container in YARN is a normal Linux process. You can see its PID by running something like "ps".
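
To make point 3 concrete, here is a minimal Python sketch of how you might spot YARN container processes in "ps" output on a worker node. It assumes the container ID (something like container_1484550000000_0001_01_000002) appears somewhere in the process command line, which is commonly the case because the log and working directories are named after it, but this is not guaranteed for every application. The function names find_yarn_containers and scan_local_processes are made up for this illustration and are not part of any Hadoop API.

```python
import re
import subprocess

# YARN container IDs have the shape
#   container_<clusterTimestamp>_<appNumber>_<attemptNumber>_<containerNumber>
# and may carry an epoch segment (container_e17_...).
CONTAINER_ID = re.compile(r"container_(?:e\d+_)?\d+_\d+_\d+_\d+")


def find_yarn_containers(ps_output):
    """Return (pid, container_id) pairs found in `ps -eo pid,args` style text.

    Assumption: the container ID shows up in the command line, e.g. as part
    of a log-directory or working-directory path.
    """
    found = []
    for line in ps_output.splitlines():
        parts = line.strip().split(None, 1)
        if len(parts) != 2 or not parts[0].isdigit():
            continue  # skip the header row and blank lines
        pid, args = parts
        match = CONTAINER_ID.search(args)
        if match:
            found.append((int(pid), match.group(0)))
    return found


def scan_local_processes():
    """Run ps on this machine and return the matching (pid, container_id) pairs."""
    result = subprocess.run(
        ["ps", "-eo", "pid,args"],
        capture_output=True, text=True, check=True,
    )
    return find_yarn_containers(result.stdout)
```

Calling scan_local_processes() on a node that is running containers should list ordinary host PIDs, which is the point: there is no separate "inside the container" PID as there is with Docker. One container ID may match more than one PID, since the launcher shell and the JVM it starts can both carry the ID in their command lines.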

facha
  • I think I asked the wrong question conceptually, sorry about that. I have reframed the question description to make it clearer. Thanks! – RV186 Jan 16 '17 at 07:55