2

I have a clustered NameNode setup. The NameNodes are configured as active and passive.

When I make a WebHDFS call, the URL to provide is

http://<HOST>:<PORT>/webhdfs/v1/

Since I have 2 NameNodes available, I have 2 URLs available

http://<HOST1>:<PORT>/webhdfs/v1/ - it's active now
http://<HOST2>:<PORT>/webhdfs/v1/ - it's passive now

My question is: the NameNodes can fail over at any time. What value do I provide for HOST? Should I give the service name? Is there a virtual IP that is normally configured on the HDP platform which takes care of the redirection?

Or should I place a load balancer or gateway in front of the NameNodes so that failover is handled without any impact on the calling application?

Bhanu Kaushik
Vaya
  • This is more of a workaround than an answer, so I'm placing it here as a comment. You could try the HttpFS service, which is able to switch between active/passive NameNodes. – facha Oct 02 '15 at 08:43

3 Answers

0

It's a bug; it doesn't work in HA mode.

You have to explicitly specify the active NN URL every time the NN changes its state.

https://hortonworks.jira.com/browse/BUG-30030

dreamer
0

You will get an exception if you're talking to an inactive NameNode.

See my answer to "Any command to get active namenode for nameservice in hadoop?"

Community
Erik Forsberg
0

You must determine the active NameNode first, then issue your WebHDFS API request to it. Issuing WebHDFS API requests to a standby NameNode will result in an HTTP 403 error.

There is no automatic way to determine the active NameNode when using WebHDFS yet. You can use the hdfs command-line client to query the configuration or, alternatively, loop through the NameNodes, issue JMX API requests to the `/jmx?qry=Hadoop:service=NameNode,name=NameNodeStatus` endpoint, and parse the output.
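The JMX loop described above might look roughly like the following. This is only a sketch under a few assumptions: the NameNode base URLs are supplied by the caller (the hostnames and port 50070 in the usage note are hypothetical), the JMX response is JSON with a `beans` list, and the `NameNodeStatus` bean carries a `State` field whose value is `active` for the active node. The helper names `parse_state`, `fetch` and `find_active_namenode` are made up for illustration.

```python
import json
from urllib.request import urlopen

# JMX query quoted in the answer above.
JMX_QUERY = "/jmx?qry=Hadoop:service=NameNode,name=NameNodeStatus"


def parse_state(jmx_body):
    """Return the 'State' field of the first bean, or None if there is none.

    Assumes the JMX servlet answers with JSON shaped like {"beans": [{...}]}.
    """
    beans = json.loads(jmx_body).get("beans", [])
    return beans[0].get("State") if beans else None


def fetch(url, timeout=5):
    """Fetch a URL and return the response body as text."""
    with urlopen(url, timeout=timeout) as resp:
        return resp.read().decode("utf-8")


def find_active_namenode(namenodes, fetch=fetch):
    """Return the first base URL whose NameNode reports State == 'active'.

    `namenodes` is a list of base URLs such as "http://host:port".
    Unreachable nodes and unparsable responses are skipped.
    """
    for base in namenodes:
        try:
            if parse_state(fetch(base + JMX_QUERY)) == "active":
                return base
        except (OSError, ValueError):
            # OSError covers connection failures; ValueError covers bad JSON.
            continue
    return None
```

You would call it as `find_active_namenode(["http://nn1.example.com:50070", "http://nn2.example.com:50070"])` and then send the WebHDFS request to the returned base URL plus `/webhdfs/v1/...`. A failover can still happen between the check and the request, so the caller should be prepared to repeat the lookup when it gets a 403.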

dschiavu