
I am using Hadoop 2.7.1 on CentOS 7.

When High Availability is enabled on a Hadoop cluster and the active NameNode fails, it becomes standby.

But WebHDFS doesn't support High Availability, does it?

What is the alternative for sending GET and PUT requests to the newly active NameNode when the original NameNode fails?

franklinsijo
oula alshiekh

2 Answers


Yes, WebHDFS is not High Availability aware. This issue is still open; refer to HDFS-6371.

Instead, you can opt for HttpFs. It is interoperable with the WebHDFS REST API and is HA-aware.

Or, write a custom implementation that redirects requests to the active NameNode.
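A minimal sketch of such a redirect, assuming two NameNode addresses (the hostnames below are hypothetical): probe each NameNode in turn and use the first one that answers as active. The probe is passed in as a command so the selection loop stays self-contained; in a real cluster it could be a WebHDFS read, which a standby NameNode rejects with a StandbyException.

```shell
# Hedged sketch, not Hadoop's own mechanism: pick the first NameNode
# that a probe reports as active. Hostnames below are assumptions.
#
# Usage: pick_active PROBE NN...
# Echoes the first NameNode for which "PROBE NN" succeeds.
pick_active() {
  probe="$1"; shift
  for nn in "$@"; do
    if "$probe" "$nn"; then
      echo "$nn"
      return 0
    fi
  done
  return 1   # no NameNode answered as active
}

# In a real cluster the probe could be a WebHDFS read, since a standby
# NameNode rejects reads with a StandbyException, e.g. (hypothetical hosts):
#   is_active() {
#     curl -s --max-time 5 "http://$1/webhdfs/v1/?op=GETFILESTATUS" \
#       | grep -q '"FileStatus"'
#   }
#   NN=$(pick_active is_active nn1.example.com:50070 nn2.example.com:50070)
#   curl -i "http://$NN/webhdfs/v1/some/path?user.name=root&op=OPEN"
```

Note this only covers picking the NameNode for the initial request; it does not change how DataNodes handle redirects, which is why an HA-aware proxy such as HttpFs is the more robust option.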

franklinsijo
  • Even though we redirect our requests to the currently active NameNode (the one that becomes active after the main NameNode fails), won't a PUT request still fail, because the DataNode will redirect it to the original main NameNode, which is currently down? Isn't this true? – oula alshiekh Apr 09 '17 at 07:50
  • Can you suggest some useful links or videos for working with HttpFs, if there are any? – oula alshiekh Apr 09 '17 at 07:55
  • When I issue the following command on a PuTTY terminal: curl -i -L "http://192.168.4.128:50070/webhdfs/v1/aloosh /a1.tbl/?user.name=root&op=OPEN", I get no response, and after some time an "empty reply from server" message, even though issuing this URL from a browser works fine. Any idea? – oula alshiekh Apr 09 '17 at 08:22
  • 1
    No, the datanodes are configured with the information of both the namenodes and are designed to send block reports to both the namenodes. It is not the case for client requests, webhdfs do not handle it. Thus a suitable alternative, which is aware of HDFS HA, like HttpFs is required. As for the tutorial, HttpFs and WebHDFS work similarly. Use [this](https://hadoop.apache.org/docs/current/hadoop-hdfs-httpfs/ServerSetup.html) for setup and [this](https://hadoop.apache.org/docs/current/hadoop-hdfs-httpfs/UsingHttpTools.html) for getting started. – franklinsijo Apr 09 '17 at 13:12

The WebHDFS server runs in the same process as the NameNode, so you need to run a WebHDFS-compatible proxy server that hides the NameNode failover from clients:

  1. HttpFs - part of Hadoop
  2. Apache Knox - part of the HDP distribution

They are both WebHDFS-compatible, so you don't need to change any of your REST API calls.
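For illustration, a hedged sketch of what "compatible" means in practice: only the host and port change when pointing an existing WebHDFS call at HttpFs (the hostname below is an assumption; 14000 is the default HttpFs port), while the /webhdfs/v1 REST path and operations stay the same.

```shell
# Hypothetical HttpFs endpoint; the hostname is an assumption.
# 14000 is the default HttpFs port.
HTTPFS="http://httpfs.example.com:14000"

# Same REST path and operations as with WebHDFS; only host/port differ:
URL="$HTTPFS/webhdfs/v1/user/root/a1.tbl?user.name=root&op=OPEN"
echo "$URL"

# Against a live HttpFs server the request would be:
#   curl -s -L "$URL"
```

Because HttpFs itself talks to HDFS through an HA-aware client, the same URL keeps working after a NameNode failover.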

prudenko