I have a multi-node Hadoop cluster consisting of two machines (one name node and two data nodes in each machine).

I am using:

hadoop fs -put dir1 hdfspath

With the above command, will the data be distributed across both machines or stored on only one machine?

How is the data balanced: do I have to run the Hadoop balancer tool, or is there an automatic way to do this?

Mosab Shaheen

1 Answer

It will depend on two factors:

  • The size of the data you are storing
  • The block size configured for HDFS

If a file is larger than the block size, HDFS splits it into block-sized pieces, and those blocks can be stored on different data nodes. A file smaller than the block size occupies a single block.
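To make the splitting concrete, here is a minimal sketch of the arithmetic only. It does not talk to HDFS. The 128 MB block size and the 300 MB file are assumptions for illustration (check the block size in your own configuration), and `block_count` and `block_sizes` are hypothetical helper names, not part of any Hadoop API.

```python
MB = 1024 * 1024  # bytes in one mebibyte


def block_count(file_size: int, block_size: int) -> int:
    """Return how many blocks a file of file_size bytes would occupy."""
    if block_size <= 0:
        raise ValueError("block_size must be positive")
    if file_size < 0:
        raise ValueError("file_size must not be negative")
    # Ceiling division: every block is full except possibly the last one.
    return -(-file_size // block_size)


def block_sizes(file_size: int, block_size: int) -> list[int]:
    """Return the size in bytes of each block, in order."""
    full_blocks, remainder = divmod(file_size, block_size)
    sizes = [block_size] * full_blocks
    if remainder:
        sizes.append(remainder)  # the last block only holds the leftover bytes
    return sizes


if __name__ == "__main__":
    # Assumed values: a 300 MB file and a 128 MB block size.
    file_size = 300 * MB
    block_size = 128 * MB
    print(block_count(file_size, block_size))
    print([size // MB for size in block_sizes(file_size, block_size)])
```

Under these assumptions the file needs three blocks (two full ones and a smaller final one), so there are three pieces that could end up on different data nodes. Where each block actually lands is decided by the name node, not by this calculation.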

techprat