I have a simple Java client that saves files to HDFS, configured with one NameNode. For this, I use a Hadoop configuration, specifying the default filesystem like:

// Point the client directly at a single NameNode
org.apache.hadoop.conf.Configuration conf = new org.apache.hadoop.conf.Configuration();
conf.set("fs.defaultFS", "hdfs://NNip:port");

However, in the future, I will need to connect to an HDFS cluster configured with one active NameNode and one standby NameNode, and in case the active NameNode goes down, automatically fail over to the standby NameNode.

Does anyone have any advice on how this could be achieved? Any link / example would be much appreciated, as I am still new to anything related to the Hadoop platform.

Thanks

Asleep

1 Answer

The Configuration object will, by default, read an hdfs-site.xml (and core-site.xml) file found on your classpath.

Ideally, you should ship this file with your Java application, or otherwise define a HADOOP_CONF_DIR environment variable in the OS. This is how the hdfs CLI tools work, for example; they just forward to Java classes.
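As a minimal sketch of loading the cluster's files explicitly instead of relying on the classpath (the /etc/hadoop/conf path here is just an assumption about where your files live):

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

Configuration conf = new Configuration();
// Load the cluster's own config files; adjust the paths to your environment
conf.addResource(new Path("/etc/hadoop/conf/core-site.xml"));
conf.addResource(new Path("/etc/hadoop/conf/hdfs-site.xml"));
// fs.defaultFS now comes from those files, so no conf.set(...) is needed
FileSystem fs = FileSystem.get(conf);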

Then, if your cluster is using NameNode HA, those files will already contain the correct value of fs.defaultFS (the logical nameservice), so you don't need to set that yourself.

If you wanted to do it programmatically, you need to configure ZooKeeper for the NameNodes and a "nameservice" for HDFS; the relevant properties can all be found in that XML file.
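A rough sketch of the client side of that programmatic route (the nameservice name "mycluster" and the host:port values are placeholders, not anything from your actual cluster):

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;

Configuration conf = new Configuration();
// Use a logical nameservice instead of a single NameNode address
conf.set("fs.defaultFS", "hdfs://mycluster");
conf.set("dfs.nameservices", "mycluster");
// The two NameNodes behind the nameservice
conf.set("dfs.ha.namenodes.mycluster", "nn1,nn2");
conf.set("dfs.namenode.rpc-address.mycluster.nn1", "namenode1.example.com:8020");
conf.set("dfs.namenode.rpc-address.mycluster.nn2", "namenode2.example.com:8020");
// Proxy provider that fails over to the other NameNode when the active one is down
conf.set("dfs.client.failover.proxy.provider.mycluster",
    "org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider");
FileSystem fs = FileSystem.get(conf);

With this in place the client never names a single NameNode; it talks to the nameservice, and the failover proxy provider handles switching to the standby.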

OneCricketeer