How to split the network load in Hadoop HDFS

Question

I have two Hadoop servers: one is the namenode and the other is the secondary namenode. Both are also datanodes. Currently, when I read a file through the namenode's port 8020, it works, but all the network load goes to that node. Is there a way to divide the network load so that both servers are used?

I appreciate your help

Ideally, you do not put a datanode on a namenode. And all access requests must go through the namenode anyway. A secondary namenode is not actually "a namenode", by the way – OneCricketeer, Mar 08 '18 at 01:16

score 1 · Answer 1 · answered Mar 07 '18 at 19:59

For your situation, you can't do anything. Namenode HA exists, but it is active/standby rather than distributed. The closest thing to what you want is called federation, but that is meant for clusters on the order of 10K nodes, not 2 nodes.

You can read more about those here:

score 1 · Ben Watson · Answer 2 · 2018-03-08T10:09:34.907

A few things here that could help:

It's never recommended to have datanodes on the same nodes as the namenodes.
If your file is stored on the datanode that is also the primary namenode, all network traffic will go to that node. You're asking that node (as the namenode) where the file is, and the data is then returned from the same node (as the datanode).

This problem will go away if you get more servers.
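To make the read path above concrete, here is a toy model, not real Hadoop code. The client learns from the namenode which datanodes hold each block, then pulls each block from one of them. All function and node names are made up for illustration, and the alternating replica order in the second case is an assumption: real HDFS orders replicas by network distance to the client, so an even split is not guaranteed.

```python
from collections import Counter

# Toy model only: every name here is hypothetical, none of it is a Hadoop API.
BLOCK_SIZE_MB = 128  # common HDFS default block size


def read_file(block_locations, pick_replica):
    """Return the MB served by each datanode for one full read of a file.

    block_locations: one list per block, naming the datanodes that hold it
                     (this stands in for the namenode's answer to the client).
    pick_replica:    function that chooses which listed datanode to read from.
    """
    served = Counter()
    for replicas in block_locations:
        served[pick_replica(replicas)] += BLOCK_SIZE_MB
    return dict(served)


def first_replica(replicas):
    # The client reads from the first datanode the namenode lists.
    return replicas[0]


# Case 1: all four blocks live only on node1 (which is also the namenode).
only_node1 = [["node1"]] * 4

# Case 2: every block is on both nodes, and we ASSUME the listed order
# alternates so that reads are spread across the two datanodes.
replicated = [["node1", "node2"], ["node2", "node1"]] * 2

print(read_file(only_node1, first_replica))
print(read_file(replicated, first_replica))
```

In the first case every megabyte comes from node1, which matches what the question describes. In the second case the load splits only because the blocks exist on both nodes and the assumed ordering spreads the reads.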

edited Mar 08 '18 at 10:09

answered Mar 08 '18 at 08:54

