
I have a Spring Boot application that uses spring-yarn-boot:2.2.0.RELEASE to access a Hadoop filesystem (HDFS). The operations I perform are LISTSTATUS, GETFILESTATUS and OPEN (to read a file). The HDFS URI is specified through application.properties:

spring.hadoop.fsUri=webhdfs://127.0.0.1:50070/webhdfs/v1/

I create a bean to which I provide the Hadoop Configuration (which Spring somehow automagically prepares for me on startup):

SimplerFileSystem fs = new SimplerFileSystem(FileSystem.get(configuration));
FsShell shell = new FsShell(configuration);
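For context, the full wiring looks roughly like this (a simplified sketch; the class name HdfsClientConfig is just illustrative, and I'm assuming the auto-configured Hadoop Configuration can be injected by type):

import java.io.IOException;
import org.apache.hadoop.fs.FileSystem;
import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;
import org.springframework.data.hadoop.fs.FsShell;
import org.springframework.data.hadoop.fs.SimplerFileSystem;

@Configuration
public class HdfsClientConfig {

    // the Hadoop Configuration that spring-yarn-boot builds from the spring.hadoop.* properties
    @Autowired
    private org.apache.hadoop.conf.Configuration hadoopConfiguration;

    @Bean
    public SimplerFileSystem simplerFileSystem() throws IOException {
        // the scheme in spring.hadoop.fsUri decides which FileSystem implementation is returned
        return new SimplerFileSystem(FileSystem.get(hadoopConfiguration));
    }

    @Bean
    public FsShell fsShell() {
        return new FsShell(hadoopConfiguration);
    }
}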

Everything works as expected, but problems arose when I received two new requirements.

The first is that HDFS will be protected with SSL from now on. I can't find any way to tell my application that the fsUri starting with webhdfs:// is actually an HTTPS connection. And if I give the https URL directly, I get an exception:

java.io.IOException: No FileSystem for scheme: https
    at org.apache.hadoop.fs.FileSystem.getFileSystemClass(FileSystem.java:2584)

... which is thrown by this code: FileSystem.get(configuration).

This is driving me crazy; I can't seem to find a way to get past it.

The second requirement is that I need to authenticate against WebHDFS with basic authentication. For this I also can't find any support in the client API.

Has anyone done this before and can share instructions? Or does anyone know of a different client API that I could use to accomplish this?

One option is to implement the REST calls myself with RestTemplate or any other REST client API, but this doesn't seem like such a special use case, so I'm really hoping something already exists.
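If I had to go that route, I imagine it would look something like this sketch: a plain RestTemplate call against the WebHDFS REST endpoints with the Authorization header set by hand. The host, port, credentials and the listStatus helper are of course just placeholders, not anything the Hadoop client provides:

import java.nio.charset.StandardCharsets;
import java.util.Base64;
import org.springframework.http.HttpEntity;
import org.springframework.http.HttpHeaders;
import org.springframework.http.HttpMethod;
import org.springframework.http.ResponseEntity;
import org.springframework.web.client.RestTemplate;

public class WebHdfsRestClient {

    private final RestTemplate restTemplate = new RestTemplate();
    private final String baseUrl = "https://127.0.0.1:50070/webhdfs/v1"; // placeholder host/port
    private final String user = "hdfs-user";                            // placeholder credentials
    private final String password = "secret";

    // LISTSTATUS via the WebHDFS REST API, with HTTP Basic auth added manually
    public String listStatus(String path) {
        String token = Base64.getEncoder()
                .encodeToString((user + ":" + password).getBytes(StandardCharsets.UTF_8));
        HttpHeaders headers = new HttpHeaders();
        headers.set(HttpHeaders.AUTHORIZATION, "Basic " + token);

        ResponseEntity<String> response = restTemplate.exchange(
                baseUrl + path + "?op=LISTSTATUS",
                HttpMethod.GET,
                new HttpEntity<Void>(headers),
                String.class);
        return response.getBody(); // JSON FileStatuses payload
    }
}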

EDIT:

Found a solution to the HTTPS problem. One should use swebhdfs:// as the URL prefix and everything will work. Still haven't found a solution to the Basic Auth problem.
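In other words, the property in application.properties becomes (assuming the rest of the URI stays as before, only the scheme changes):

spring.hadoop.fsUri=swebhdfs://127.0.0.1:50070/webhdfs/v1/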

Just for information for readers of this question: since I did not find any way to use Basic Authentication with the Hadoop API, I implemented the few interactions I had with HDFS using the Apache HTTP Client and Spring's RestTemplate. – Tarmo Nov 25 '15 at 15:37
