
I can access HDFS in the terminal via `hdfs dfs -ls /`, and I get the address and port of the cluster with `hdfs getconf -confKey fs.defaultFS` (I refer to this address and port in the code below).
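
For concreteness, the two commands look like this (the output shape is illustrative; your NameNode address and port will differ):

    $ hdfs getconf -confKey fs.defaultFS
    hdfs://<address>:<port>
    $ hdfs dfs -ls /
    ... listing of the HDFS root, including /somedata.txt ...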

Trying to read files on HDFS in Java gave me errors similar to those described here (also discussed in this question). With the address, I try the following in Java:

        FileSystem fs;
        String line;
        Path path = new Path("hdfs://<address>:<port>/somedata.txt");
        try 
        {
            /* --------------------------
             * Option 1: Gave 'Wrong FS: hdfs://..., Expected file:///' error
            Configuration configuration = new Configuration();
            configuration.addResource(new Path("/etc/hadoop/conf/core-site.xml"));
            configuration.addResource(new Path("/etc/hadoop/conf/hdfs-site.xml"));
            fs = FileSystem.get(configuration);
            * ---------------------------
            */

            // --------------------------
            // Option 2: Gives error stated below
            Configuration configuration = new Configuration();
            fs = FileSystem.get(new URI("hdfs://<address>:<port>"),configuration);
            // --------------------------

            LOG.info(fs.getConf().toString());

            FSDataInputStream fsDataInputStream = fs.open(path);
            InputStreamReader inputStreamReader = new InputStreamReader(fsDataInputStream);
            BufferedReader bufferedReader = new BufferedReader(inputStreamReader);
            while ((line = bufferedReader.readLine()) != null) {
                // some file processing code here.
            }
            bufferedReader.close();
        } 
        catch (Exception e) 
        {
            fail();
        }

The error that option 2 gives me is:

    java.lang.NoSuchMethodError: org.apache.hadoop.hdfs.server.namenode.NameNode.getAddress(Ljava/lang/String;)Ljava/net/InetSocketAddress;
        at org.apache.hadoop.hdfs.DistributedFileSystem.initialize(DistributedFileSystem.java:99)
        at org.apache.hadoop.fs.FileSystem.createFileSystem(FileSystem.java:1446)
        at org.apache.hadoop.fs.FileSystem.access$200(FileSystem.java:67)
        at org.apache.hadoop.fs.FileSystem$Cache.get(FileSystem.java:1464)
        at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:263)
        at fwt.gateway.Test_Runner.checkLocationMasterindicesOnHDFS(Test_Runner.java:76)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:498)
        at org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:50)
        at org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
        at org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:47)
        at org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
        at org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26)
        at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:325)
        at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:78)
        at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:57)
        at org.junit.runners.ParentRunner$3.run(ParentRunner.java:290)
        at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:71)
        at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:288)
        at org.junit.runners.ParentRunner.access$000(ParentRunner.java:58)
        at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:268)
        at org.junit.runners.ParentRunner.run(ParentRunner.java:363)
        at org.eclipse.jdt.internal.junit4.runner.JUnit4TestReference.run(JUnit4TestReference.java:86)
        at org.eclipse.jdt.internal.junit.runner.TestExecution.run(TestExecution.java:38)
        at org.eclipse.jdt.internal.junit.runner.RemoteTestRunner.runTests(RemoteTestRunner.java:459)
        at org.eclipse.jdt.internal.junit.runner.RemoteTestRunner.runTests(RemoteTestRunner.java:678)
        at org.eclipse.jdt.internal.junit.runner.RemoteTestRunner.run(RemoteTestRunner.java:382)
        at org.eclipse.jdt.internal.junit.runner.RemoteTestRunner.main(RemoteTestRunner.java:192)

The fact that I can access the files from the terminal indicates to me that core-site.xml and hdfs-site.xml must be correct.
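
Note that the terminal and the Java client can still disagree: the `hdfs` command reads the cluster XML from `HADOOP_CONF_DIR`, while a plain `new Configuration()` only loads whatever core-site.xml happens to be on the Java classpath. A minimal sketch to check what the client actually resolves (the resource paths are the ones from Option 1 and are an assumption about your system):

    Configuration conf = new Configuration();
    // Assumed locations, as in Option 1; adjust if your cluster config lives elsewhere.
    conf.addResource(new Path("/etc/hadoop/conf/core-site.xml"));
    conf.addResource(new Path("/etc/hadoop/conf/hdfs-site.xml"));
    // If this prints "file:///", the XML resources were not picked up and
    // FileSystem.get(conf) will return the local filesystem (the "Wrong FS" error).
    System.out.println(conf.get("fs.defaultFS"));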

Thanks for the help!

EDIT 1: The Maven dependencies I use for the code above are the following:

    <dependency>
        <groupId>org.apache.hadoop</groupId>
        <artifactId>hadoop-hdfs</artifactId>
        <version>3.0.0-alpha4</version>
    </dependency>

    <dependency>
        <groupId>org.apache.hadoop</groupId>
        <artifactId>hadoop-core</artifactId>
        <version>1.2.1</version>
    </dependency>

    <dependency>
        <groupId>org.apache.hadoop</groupId>
        <artifactId>hadoop-client</artifactId>
        <version>3.0.0-alpha4</version>
    </dependency>
tenticon
  • Looks like you might be missing some dependencies. If you're using Maven or similar, can you share the Hadoop imports you are doing? – StefanE Aug 02 '17 at 12:33
  • Thanks, it is updated. Why the downvote? – tenticon Aug 02 '17 at 12:36
  • Probably because you don't know what a "NoSuchMethodError" is ... and that it typically points at a **versioning** conflict of some sort. Library A wants to call a method from library B, but that method in B isn't there (any more). In that sense: did you do some prior research, like searching for that exception message? – GhostCat Aug 02 '17 at 12:41
  • As @GhostCat mentioned, your pom shows a conflict of versions between hadoop-core and the other dependencies. Should be `hadoop-common` and `3.0.0-alpha4`, I think. – philantrovert Aug 02 '17 at 12:46
  • This is definitely a version mismatch in the dependencies. Please check your build classpath. – Pradatta Aug 02 '17 at 12:46
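
As the comments point out, a `NoSuchMethodError` at runtime almost always means the code was compiled against one version of a class but a different version was loaded at runtime. Assuming a Maven build, the dependency tree shows which Hadoop artifacts and versions are actually pulled in:

    $ mvn dependency:tree -Dincludes=org.apache.hadoop

In this POM, the 1.2.1 `hadoop-core` and the 3.0.0-alpha4 artifacts would show up side by side, which is exactly the kind of conflict that produces the error above.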

2 Answers


Update your POM to the following:

    <dependency>
        <groupId>org.apache.hadoop</groupId>
        <artifactId>hadoop-client</artifactId>
        <version>2.8.1</version>
    </dependency>
    <dependency>
        <groupId>org.apache.hadoop</groupId>
        <artifactId>hadoop-core</artifactId>
        <version>2.6.0-mr1-cdh5.4.2.1</version>
        <type>pom</type>
    </dependency>
    <dependency>
        <groupId>org.apache.hadoop</groupId>
        <artifactId>hadoop-hdfs</artifactId>
        <version>2.8.1</version>
    </dependency>

Never use alpha versions, as they are likely to have bugs.
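
Once the versions are consistent, a quick runtime check can confirm what is actually on the classpath. A minimal sketch using Hadoop's `VersionInfo` utility (the class name here is just an illustration):

    import org.apache.hadoop.util.VersionInfo;

    public class HadoopVersionCheck {
        public static void main(String[] args) {
            // Prints the Hadoop version resolved at runtime, e.g. "2.8.1".
            System.out.println(VersionInfo.getVersion());
        }
    }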

StefanE

You can use this in your pom.xml file:

    <dependency>
        <groupId>org.apache.hadoop</groupId>
        <artifactId>hadoop-common</artifactId>
        <version>2.6.0</version>
    </dependency>

I have used version 2.6.0; you can try any newer version.
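
For completeness, with a consistent dependency set the read from the question could look like this (a sketch; `<address>`, `<port>`, and `/somedata.txt` are the placeholders from the question):

    import java.io.BufferedReader;
    import java.io.InputStreamReader;
    import java.net.URI;

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.FileSystem;
    import org.apache.hadoop.fs.Path;

    public class HdfsReadExample {
        public static void main(String[] args) throws Exception {
            Configuration conf = new Configuration();
            // Placeholders as in the question; use your NameNode address and port.
            FileSystem fs = FileSystem.get(new URI("hdfs://<address>:<port>"), conf);
            // try-with-resources closes the reader and the underlying HDFS stream.
            try (BufferedReader reader = new BufferedReader(
                    new InputStreamReader(fs.open(new Path("/somedata.txt"))))) {
                String line;
                while ((line = reader.readLine()) != null) {
                    System.out.println(line);
                }
            }
        }
    }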

Avijit