Here is the simple experiment I used to demonstrate the problem, written in Scala, with Hadoop 2.7.7 as the only dependency:
    import org.apache.hadoop.security.UserGroupInformation
    import scala.concurrent.duration.{Duration, MINUTES}
    import scala.concurrent.{Await, ExecutionContext, Future}

    // Extending App lets the object body run as the program's entry point
    object UGITest extends App {
      // Captured on the main thread
      val ugi = UserGroupInformation.getCurrentUser
      val credential = ugi.getCredentials

      val ff = Future {
        // Looked up again on a thread from the global execution context
        val _ugi = UserGroupInformation.getCurrentUser
        val _credential = _ugi.getCredentials
        require(ugi == _ugi, s"UGI is lost, before: $ugi, now ${_ugi}")
        require(credential == _credential, s"credential is lost, before: $credential, now ${_credential}")
      }(ExecutionContext.global)

      // Rethrows any failure from the Future on the main thread
      Await.result(ff, Duration.apply(1, MINUTES))
    }
The first requirement, ugi == _ugi, passed, indicating that the closure of the Future was successfully launched in a child thread. The second requirement, credential == _credential, failed with the following error:
    java.lang.IllegalArgumentException: requirement failed: credential is lost, before: org.apache.hadoop.security.Credentials@cb6e68f, now org.apache.hadoop.security.Credentials@6b746674
        at scala.Predef$.require(Predef.scala:281)
        at ...
It appears that the same UserGroupInformation is used, but all credentials are lost. What's the purpose of this design?
The above experiment was run on a single machine, not in a cluster. I haven't tested it with any Hadoop authentication framework (e.g. Kerberos) enabled, but I think the result will be more or less the same.
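To narrow this down, here is a minimal sketch that compares two Credentials objects by content rather than with ==. It rests on two assumptions taken from my reading of the Hadoop 2.7 source and not otherwise confirmed: that getCredentials returns a fresh copy on every call, and that Credentials does not override equals, so == falls back to reference identity. CredentialsCompare and sameContents are hypothetical names of my own, not part of Hadoop.

```scala
import org.apache.hadoop.security.{Credentials, UserGroupInformation}
import scala.collection.JavaConverters._

object CredentialsCompare {
  // Hypothetical helper (not a Hadoop API): treats two Credentials as equal
  // when they hold the same set of tokens and the same number of secret keys.
  def sameContents(a: Credentials, b: Credentials): Boolean = {
    val tokensA = a.getAllTokens.asScala.toSet
    val tokensB = b.getAllTokens.asScala.toSet
    tokensA == tokensB && a.numberOfSecretKeys == b.numberOfSecretKeys
  }

  def main(args: Array[String]): Unit = {
    val ugi = UserGroupInformation.getCurrentUser
    // Two calls on the same thread, with no Future involved
    val first = ugi.getCredentials
    val second = ugi.getCredentials
    println(s"same reference: ${first eq second}")
    println(s"same contents: ${sameContents(first, second)}")
  }
}
```

If both assumptions hold, the first line should report different references even within a single thread, and the second should report matching contents. That would suggest the credentials are copied on each call rather than lost in the child thread.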