I can access HDFS from the terminal via hdfs dfs -ls /
and I get the address and port of the cluster via hdfs getconf -confKey fs.defaultFS
(I refer to them as <address> and <port> in the code below).
Trying to read files on HDFS in Java gave me errors similar to those described here (also discussed in this question). With that address I tried the following in Java:
FileSystem fs;
String line;
Path path = new Path("hdfs://<address>:<port>/somedata.txt");
try
{
    /* --------------------------
     * Option 1: Gave 'Wrong FS: hdfs://..., Expected file:///' error
    Configuration configuration = new Configuration();
    configuration.addResource(new Path("/etc/hadoop/conf/core-site.xml"));
    configuration.addResource(new Path("/etc/hadoop/conf/hdfs-site.xml"));
    fs = FileSystem.get(configuration);
     * ---------------------------
     */
    // --------------------------
    // Option 2: Gives error stated below
    Configuration configuration = new Configuration();
    fs = FileSystem.get(new URI("hdfs://<address>:<port>"), configuration);
    // --------------------------
    LOG.info(fs.getConf().toString());
    FSDataInputStream fsDataInputStream = fs.open(path);
    InputStreamReader inputStreamReader = new InputStreamReader(fsDataInputStream);
    BufferedReader bufferedReader = new BufferedReader(inputStreamReader);
    while ((line = bufferedReader.readLine()) != null) {
        // some file processing code here.
    }
    bufferedReader.close();
}
catch (Exception e)
{
    fail();
}
The error that Option 2 gives me is:
java.lang.NoSuchMethodError: org.apache.hadoop.hdfs.server.namenode.NameNode.getAddress(Ljava/lang/String;)Ljava/net/InetSocketAddress;
at org.apache.hadoop.hdfs.DistributedFileSystem.initialize(DistributedFileSystem.java:99)
at org.apache.hadoop.fs.FileSystem.createFileSystem(FileSystem.java:1446)
at org.apache.hadoop.fs.FileSystem.access$200(FileSystem.java:67)
at org.apache.hadoop.fs.FileSystem$Cache.get(FileSystem.java:1464)
at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:263)
at fwt.gateway.Test_Runner.checkLocationMasterindicesOnHDFS(Test_Runner.java:76)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:50)
at org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
at org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:47)
at org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
at org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26)
at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:325)
at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:78)
at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:57)
at org.junit.runners.ParentRunner$3.run(ParentRunner.java:290)
at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:71)
at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:288)
at org.junit.runners.ParentRunner.access$000(ParentRunner.java:58)
at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:268)
at org.junit.runners.ParentRunner.run(ParentRunner.java:363)
at org.eclipse.jdt.internal.junit4.runner.JUnit4TestReference.run(JUnit4TestReference.java:86)
at org.eclipse.jdt.internal.junit.runner.TestExecution.run(TestExecution.java:38)
at org.eclipse.jdt.internal.junit.runner.RemoteTestRunner.runTests(RemoteTestRunner.java:459)
at org.eclipse.jdt.internal.junit.runner.RemoteTestRunner.runTests(RemoteTestRunner.java:678)
at org.eclipse.jdt.internal.junit.runner.RemoteTestRunner.run(RemoteTestRunner.java:382)
at org.eclipse.jdt.internal.junit.runner.RemoteTestRunner.main(RemoteTestRunner.java:192)
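A NoSuchMethodError generally means that the class found at runtime is not the version the calling code was compiled against. A minimal sketch for checking which jar a class is actually loaded from is shown here; the class name ClassOrigin and the method locate are hypothetical names chosen for illustration, and the sketch uses only the JDK.

```java
import java.security.CodeSource;

public class ClassOrigin {
    // Returns the location (usually a jar path) a class was loaded from,
    // or a placeholder when the JVM reports none (e.g. JDK core classes).
    static String locate(String className) {
        try {
            // 'false' avoids running static initializers while looking the class up.
            Class<?> c = Class.forName(className, false, ClassOrigin.class.getClassLoader());
            CodeSource src = c.getProtectionDomain().getCodeSource();
            if (src == null || src.getLocation() == null) {
                return "<no code source>";
            }
            return src.getLocation().toString();
        } catch (ClassNotFoundException | LinkageError e) {
            return "<not on classpath>";
        }
    }

    public static void main(String[] args) {
        // The two classes named in the first frames of the stack trace.
        String[] names = {
            "org.apache.hadoop.hdfs.DistributedFileSystem",
            "org.apache.hadoop.hdfs.server.namenode.NameNode"
        };
        for (String name : names) {
            System.out.println(name + " -> " + locate(name));
        }
    }
}
```

Run with the same classpath as the failing test: if the two classes resolve to jars of different Hadoop versions, that would be consistent with the error above.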
The fact that I can access the files from the terminal indicates to me that core-site.xml and hdfs-site.xml must be correct.
Thanks for the help!
EDIT 1: The Maven dependencies I use for the code above are the following:
<dependency>
    <groupId>org.apache.hadoop</groupId>
    <artifactId>hadoop-hdfs</artifactId>
    <version>3.0.0-alpha4</version>
</dependency>
<dependency>
    <groupId>org.apache.hadoop</groupId>
    <artifactId>hadoop-core</artifactId>
    <version>1.2.1</version>
</dependency>
<dependency>
    <groupId>org.apache.hadoop</groupId>
    <artifactId>hadoop-client</artifactId>
    <version>3.0.0-alpha4</version>
</dependency>
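These dependencies combine a Hadoop 1.x artifact (hadoop-core 1.2.1) with 3.0.0-alpha4 artifacts, and such a mix may be related to the error. A minimal sketch for spotting mixed Hadoop versions on the runtime classpath follows; the class name HadoopJarVersions, its methods, and the assumption that jars are named hadoop-<artifact>-<version>.jar are illustrative and not part of Hadoop.

```java
import java.io.File;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class HadoopJarVersions {
    // Assumed naming scheme: hadoop-<artifact>-<version>.jar
    private static final Pattern JAR =
        Pattern.compile("^(hadoop-[a-z-]+?)-(\\d[\\w.\\-]*)\\.jar$");

    // Maps each Hadoop artifact found among the classpath entries to its version.
    static Map<String, String> versions(List<String> classpathEntries) {
        Map<String, String> found = new TreeMap<>();
        for (String entry : classpathEntries) {
            Matcher m = JAR.matcher(new File(entry).getName());
            if (m.matches()) {
                found.put(m.group(1), m.group(2));
            }
        }
        return found;
    }

    // True when the artifacts do not all share a single version.
    static boolean mixed(Map<String, String> versions) {
        return new HashSet<>(versions.values()).size() > 1;
    }

    public static void main(String[] args) {
        String cp = System.getProperty("java.class.path", "");
        Map<String, String> found = versions(Arrays.asList(cp.split(File.pathSeparator)));
        found.forEach((artifact, version) -> System.out.println(artifact + " " + version));
        System.out.println("mixed versions: " + mixed(found));
    }
}
```

Some launchers pass the classpath through a manifest jar, in which case java.class.path will not list every entry and the output should be treated as a hint only.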