I have a Spring Boot application that accesses HDFS through WebHDFS, proxied by Apache Knox and secured with Kerberos. I created my own KnoxWebHdfsFileSystem
with a custom scheme (swebhdfsknox) as a subclass of WebHdfsFileSystem
that only changes the URLs to contain the Knox proxy prefix. So it effectively remaps requests from the form:
http://host:port/webhdfs/v1/...
to the Knox one:
http://host:port/gateway/default/webhdfs/v1/...
I do this by overriding two methods:
public URI getUri()
URL toUrl(Op op, Path fspath, Param<?, ?>... parameters)
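To illustrate the remapping, here is a minimal sketch of the prefix rewrite using only JDK classes. It is not the actual override: the class name KnoxPathRewrite and the method toKnoxUri are hypothetical, and the two prefix constants are assumptions taken from the example URLs above.

```java
import java.net.URI;

public class KnoxPathRewrite {
    // Assumed prefixes, copied from the example URLs in this post.
    static final String WEBHDFS_PREFIX = "/webhdfs/v1";
    static final String KNOX_PREFIX = "/gateway/default";

    // Hypothetical helper: inserts the Knox prefix in front of a WebHDFS path.
    // URIs whose path does not start with /webhdfs/v1 are returned unchanged,
    // so an already rewritten URI is not prefixed twice.
    static URI toKnoxUri(URI original) {
        String path = original.getRawPath();
        if (path == null || !path.startsWith(WEBHDFS_PREFIX)) {
            return original;
        }
        StringBuilder sb = new StringBuilder();
        sb.append(original.getScheme()).append("://")
          .append(original.getRawAuthority())
          .append(KNOX_PREFIX)
          .append(path);
        // Keep the query string (op=..., other parameters) exactly as it was.
        if (original.getRawQuery() != null) {
            sb.append('?').append(original.getRawQuery());
        }
        return URI.create(sb.toString());
    }
}
```

In the real subclass, the same rewrite would be applied to the URL that toUrl builds before it is returned.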
So far so good. I let Spring Boot create an FsShell
for me and use it for various operations such as listing files, mkdir, etc. All of them work fine except copyFromLocal, which, as documented, requires two steps and a redirect. On the last step, when the filesystem tries to PUT
to the final URL it received in the Location header, it fails with this error:
org.apache.hadoop.security.AccessControlException: Authentication required
at org.apache.hadoop.hdfs.web.WebHdfsFileSystem.validateResponse(WebHdfsFileSystem.java:334) ~[hadoop-hdfs-2.6.0.jar:na]
at org.apache.hadoop.hdfs.web.WebHdfsFileSystem.access$200(WebHdfsFileSystem.java:91) ~[hadoop-hdfs-2.6.0.jar:na]
at org.apache.hadoop.hdfs.web.WebHdfsFileSystem$FsPathOutputStreamRunner$1.close(WebHdfsFileSystem.java:787) ~[hadoop-hdfs-2.6.0.jar:na]
at org.apache.hadoop.io.IOUtils.copyBytes(IOUtils.java:54) ~[hadoop-common-2.6.0.jar:na]
at org.apache.hadoop.io.IOUtils.copyBytes(IOUtils.java:112) ~[hadoop-common-2.6.0.jar:na]
at org.apache.hadoop.fs.FileUtil.copy(FileUtil.java:366) ~[hadoop-common-2.6.0.jar:na]
at org.apache.hadoop.fs.FileUtil.copy(FileUtil.java:338) ~[hadoop-common-2.6.0.jar:na]
at org.apache.hadoop.fs.FileUtil.copy(FileUtil.java:302) ~[hadoop-common-2.6.0.jar:na]
at org.apache.hadoop.fs.FileSystem.copyFromLocalFile(FileSystem.java:1889) ~[hadoop-common-2.6.0.jar:na]
at org.springframework.data.hadoop.fs.FsShell.copyFromLocal(FsShell.java:265) ~[spring-data-hadoop-core-2.2.0.RELEASE.jar:2.2.0.RELEASE]
at org.springframework.data.hadoop.fs.FsShell.copyFromLocal(FsShell.java:254) ~[spring-data-hadoop-core-2.2.0.RELEASE.jar:2.2.0.RELEASE]
I suspect the redirect is somehow the cause, but I can't figure out what exactly goes wrong. If I send the same requests via curl, the file is uploaded to HDFS successfully.
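For reference, the two-step upload can be sketched with plain JDK classes, which makes it easier to compare each request with the curl calls. This is only a sketch under assumptions: the class TwoStepCreate and its methods are hypothetical, authentication (Kerberos/SPNEGO or whatever Knox is configured with) is left out entirely, and the 307 status is what the WebHDFS documentation describes for the first CREATE request.

```java
import java.io.IOException;
import java.io.OutputStream;
import java.net.HttpURLConnection;
import java.net.URI;

public class TwoStepCreate {
    // Builds the step 1 URL. "base" is scheme://host:port plus any proxy
    // prefix, for example http://host:port/gateway/default.
    static String createUrl(String base, String hdfsPath) {
        return base + "/webhdfs/v1" + hdfsPath + "?op=CREATE&overwrite=true";
    }

    // Step 1: PUT without a body and without following redirects,
    // then read the Location header of the redirect response.
    static String requestLocation(String url) throws IOException {
        HttpURLConnection conn =
                (HttpURLConnection) URI.create(url).toURL().openConnection();
        conn.setRequestMethod("PUT");
        conn.setInstanceFollowRedirects(false);
        try {
            int status = conn.getResponseCode();
            String location = conn.getHeaderField("Location");
            if (status != 307 || location == null) {
                throw new IOException("Expected 307 with Location, got " + status);
            }
            return location;
        } finally {
            conn.disconnect();
        }
    }

    // Step 2: PUT the file content to the URL from the Location header
    // and return the HTTP status code of the response.
    static int putData(String location, byte[] data) throws IOException {
        HttpURLConnection conn =
                (HttpURLConnection) URI.create(location).toURL().openConnection();
        conn.setRequestMethod("PUT");
        conn.setDoOutput(true);
        conn.setFixedLengthStreamingMode(data.length);
        try (OutputStream out = conn.getOutputStream()) {
            out.write(data);
        }
        try {
            return conn.getResponseCode();
        } finally {
            conn.disconnect();
        }
    }
}
```

Logging the value returned by requestLocation shows exactly which URL the second PUT goes to, which is the request that fails in the stack trace above.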