[Important notice] This answer applies to a plain Hadoop cluster using a Linux KDC (typically MIT Kerberos). For a Cloudera cluster relying on a Microsoft Active Directory KDC, any .Net HTTP connector can perform SPNEGO through the Microsoft SSPI interface (sooo boring...)
~~~~
The only way I know to access WebHDFS from the Microsoft world is an ugly and complex workaround:
- install the MIT Kerberos for Windows utility on the machine that will
actually connect to HDFS, plus the appropriate Kerberos 5 config file
- make sure that your JVM has the "unlimited strength cryptography"
security policy installed (separate download, duh)
- develop a small Java utility that connects to the WebHDFS service (on
the NameNode) using SPNEGO with a GSSAPI Kerberos ticket
    - Option 1: create the ticket through the GUI, and tell Java to fetch it from the default cache
    - Option 2: tell Java to create its own ticket automatically from a keytab file, ignoring the cache (the keytab must be created on Linux with ktutil; the Windows package has no such utility)
- make your Java code run a single GET (op=GETDELEGATIONTOKEN) to retrieve
an HDFS delegation token for this session, dump the token to StdOut, then exit
- make your .Net code run the Java utility, capture StdOut, and
retrieve the token
- connect to WebHDFS (NameNode + any subsequent redirects to the DataNodes) without SPNEGO,
but passing the token in the URL (the delegation query parameter) as proof of pre-authentication
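
The Java side of the steps above could look roughly like the sketch below. It is only an illustration under several assumptions: the class and method names are made up, the NameNode is given as a host:port argument, plain http is used (a TLS-enabled cluster would need https), and the JVM's built-in HTTP "Negotiate" support is expected to find a usable Kerberos login on its own. The response is parsed with a regex to avoid pulling in a JSON library.

```java
import java.io.BufferedReader;
import java.io.InputStreamReader;
import java.net.HttpURLConnection;
import java.net.URI;
import java.nio.charset.StandardCharsets;
import java.util.regex.Matcher;
import java.util.regex.Pattern;
import java.util.stream.Collectors;

// Hypothetical helper: fetch a delegation token once, print it, exit.
class FetchDelegationToken {
    // WebHDFS answers GETDELEGATIONTOKEN with {"Token":{"urlString":"..."}}
    private static final Pattern URL_STRING =
            Pattern.compile("\"urlString\"\\s*:\\s*\"([^\"]+)\"");

    // URL of the single SPNEGO-authenticated request (nameNode is "host:port").
    static String tokenRequestUrl(String nameNode) {
        return "http://" + nameNode + "/webhdfs/v1/?op=GETDELEGATIONTOKEN";
    }

    // Pull the token out of the JSON body.
    static String extractToken(String json) {
        Matcher m = URL_STRING.matcher(json);
        if (!m.find()) {
            throw new IllegalArgumentException("no urlString in response: " + json);
        }
        return m.group(1);
    }

    // Shape of the later, non-SPNEGO calls made by the .Net code.
    // Assumption: the urlString can be placed in the URL as-is.
    static String delegatedUrl(String nameNode, String path, String op, String token) {
        return "http://" + nameNode + "/webhdfs/v1" + path
                + "?op=" + op + "&delegation=" + token;
    }

    public static void main(String[] args) throws Exception {
        if (args.length < 1) {
            System.err.println("usage: FetchDelegationToken <namenode-host:port>");
            return;
        }
        // Let the JVM's GSS layer obtain credentials itself instead of
        // requiring them on the current JAAS Subject.
        System.setProperty("javax.security.auth.useSubjectCredsOnly", "false");

        HttpURLConnection conn = (HttpURLConnection)
                URI.create(tokenRequestUrl(args[0])).toURL().openConnection();
        try (BufferedReader in = new BufferedReader(
                new InputStreamReader(conn.getInputStream(), StandardCharsets.UTF_8))) {
            String body = in.lines().collect(Collectors.joining());
            // The token on StdOut is all the calling .Net process needs.
            System.out.println(extractToken(body));
        }
    }
}
```

The .Net code would then start this utility as a child process, read the single line from StdOut, and append it to every WebHDFS URL in the form produced by `delegatedUrl`.
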
So in the end it's a Java problem. And setting up a working Kerberos config is incredibly tricky (cf. "Hadoop and Kerberos: The Madness Beyond the Gate", the current reference site about Kerberos implementation issues in the Hadoop ecosystem).
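
To give an idea of what the Java-side Kerberos config involves, here is a minimal sketch of the two ticket options (default cache vs. keytab) expressed as a programmatic JAAS configuration for the JDK's `Krb5LoginModule`. The class name, principal and keytab path are hypothetical, and the same settings are more commonly written in a JAAS login config file; whether Java actually finds the cache written by MIT Kerberos for Windows depends on the local setup (the module's `ticketCache` option may have to point at it explicitly).

```java
import java.util.HashMap;
import java.util.Map;
import javax.security.auth.login.AppConfigurationEntry;
import javax.security.auth.login.AppConfigurationEntry.LoginModuleControlFlag;
import javax.security.auth.login.Configuration;

// Hypothetical JAAS configuration covering both ticket options.
class KerberosLoginConfig extends Configuration {
    private final Map<String, String> options = new HashMap<>();

    // Option 1: reuse the ticket already sitting in the default cache.
    static KerberosLoginConfig fromTicketCache() {
        KerberosLoginConfig c = new KerberosLoginConfig();
        c.options.put("useTicketCache", "true");
        c.options.put("doNotPrompt", "true");
        return c;
    }

    // Option 2: create a fresh ticket from a keytab and ignore the cache.
    static KerberosLoginConfig fromKeytab(String principal, String keytabPath) {
        KerberosLoginConfig c = new KerberosLoginConfig();
        c.options.put("useKeyTab", "true");
        c.options.put("keyTab", keytabPath);
        c.options.put("principal", principal);
        c.options.put("storeKey", "true");
        c.options.put("useTicketCache", "false");
        c.options.put("doNotPrompt", "true");
        return c;
    }

    // The same entry is returned whatever application name is asked for,
    // including the one used by the JVM's GSS layer.
    @Override
    public AppConfigurationEntry[] getAppConfigurationEntry(String name) {
        return new AppConfigurationEntry[] {
            new AppConfigurationEntry(
                "com.sun.security.auth.module.Krb5LoginModule",
                LoginModuleControlFlag.REQUIRED,
                options)
        };
    }
}
```

It would be installed once, before the first HTTP call, with something like `Configuration.setConfiguration(KerberosLoginConfig.fromKeytab("user@EXAMPLE.COM", "C:/secure/user.keytab"))`, where both values are placeholders.
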