I am trying to connect to secured Phoenix through Spark on YARN using JDBC, and I can see in the logs that it connects successfully:
JDBC URL: jdbc:phoenix:zookeeper_quorum:/hbase-secure:someprincial@REALM.COM:/path/to/keytab/someprincipal.keytab
18/02/27 09:30:22 INFO ConnectionQueryServicesImpl: Trying to connect to a secure cluster with keytab:/path/to/keytab/someprincipal.keytab
18/02/27 09:30:22 DEBUG UserGroupInformation: hadoop login
18/02/27 09:30:22 DEBUG UserGroupInformation: hadoop login commit
18/02/27 09:30:22 DEBUG UserGroupInformation: using kerberos user:someprincial@REALM.COM
18/02/27 09:30:22 DEBUG UserGroupInformation: Using user: "someprincial@REALM.COM" with name someprincial@REALM.COM
18/02/27 09:30:22 DEBUG UserGroupInformation: User entry: "someprincial@REALM.COM"
18/02/27 09:30:22 INFO UserGroupInformation: Login successful for user someprincial@REALM.COM using keytab file /path/to/keytab/someprincipal.keytab
18/02/27 09:30:22 INFO ConnectionQueryServicesImpl: Successfull login to secure cluster!!
But later, when AbstractRpcClient is called, it fails: UserGroupInformation is no longer using KERBEROS authentication, and it appears to pick up the OS user instead of the principal I provided in the JDBC URL.
18/02/27 09:30:23 DEBUG AbstractRpcClient: RPC Server Kerberos principal name for service=ClientService is hbase/hbaseprincipa@REALM.COM
18/02/27 09:30:23 DEBUG AbstractRpcClient: Use KERBEROS authentication for service ClientService, sasl=true
18/02/27 09:30:23 DEBUG AbstractRpcClient: Connecting to some.host.name/10.000.145.544:16020
18/02/27 09:30:23 DEBUG UserGroupInformation: PrivilegedAction as:someuser (auth:SIMPLE) from:org.apache.hadoop.hbase.ipc.RpcClientImpl$Connection.setupIOstreams(RpcClientImpl.java:734)
18/02/27 09:30:23 DEBUG HBaseSaslRpcClient: Creating SASL GSSAPI client. Server's Kerberos principal name is hbase/hbaseprincipa@REALM.COM
18/02/27 09:30:23 DEBUG UserGroupInformation: PrivilegedActionException as:someuser (auth:SIMPLE) cause:javax.security.sasl.SaslException: GSS initiate failed [Caused by GSSException: No valid credentials provided (Mechanism level: Failed to find any Kerberos tgt)]
18/02/27 09:30:23 DEBUG UserGroupInformation: PrivilegedAction as:someuser (auth:SIMPLE) from:org.apache.hadoop.hbase.ipc.RpcClientImpl$Connection.handleSaslConnectionFailure(RpcClientImpl.java:637)
18/02/27 09:30:23 WARN AbstractRpcClient: Exception encountered while connecting to the server : javax.security.sasl.SaslException: GSS initiate failed [Caused by GSSException: No valid credentials provided (Mechanism level: Failed to find any Kerberos tgt)]
18/02/27 09:30:23 ERROR AbstractRpcClient: SASL authentication failed. The most likely cause is missing or invalid credentials. Consider 'kinit'.
javax.security.sasl.SaslException: GSS initiate failed [Caused by GSSException: No valid credentials provided (Mechanism level: Failed to find any Kerberos tgt)]
at com.sun.security.sasl.gsskerb.GssKrb5Client.evaluateChallenge(GssKrb5Client.java:211)
at org.apache.hadoop.hbase.security.HBaseSaslRpcClient.saslConnect(HBaseSaslRpcClient.java:179)
at org.apache.hadoop.hbase.ipc.RpcClientImpl$Connection.setupSaslConnection(RpcClientImpl.java:611)
at org.apache.hadoop.hbase.ipc.RpcClientImpl$Connection.access$600(RpcClientImpl.java:156)
at org.apache.hadoop.hbase.ipc.RpcClientImpl$Connection$2.run(RpcClientImpl.java:737)
at org.apache.hadoop.hbase.ipc.RpcClientImpl$Connection$2.run(RpcClientImpl.java:734)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:422)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1724)
at org.apache.hadoop.hbase.ipc.RpcClientImpl$Connection.setupIOstreams(RpcClientImpl.java:734)
at org.apache.hadoop.hbase.ipc.RpcClientImpl$Connection.writeRequest(RpcClientImpl.java:887)
at org.apache.hadoop.hbase.ipc.RpcClientImpl$Connection.tracedWriteRequest(RpcClientImpl.java:856)
at org.apache.hadoop.hbase.ipc.RpcClientImpl.call(RpcClientImpl.java:1199)
at org.apache.hadoop.hbase.ipc.AbstractRpcClient.callBlockingMethod(AbstractRpcClient.java:213)
at org.apache.hadoop.hbase.ipc.AbstractRpcClient$BlockingRpcChannelImplementation.callBlockingMethod(AbstractRpcClient.java:287)
at org.apache.hadoop.hbase.protobuf.generated.ClientProtos$ClientService$BlockingStub.scan(ClientProtos.java:32741)
at org.apache.hadoop.hbase.client.ScannerCallable.openScanner(ScannerCallable.java:379)
at org.apache.hadoop.hbase.client.ScannerCallable.call(ScannerCallable.java:201)
at org.apache.hadoop.hbase.client.ScannerCallable.call(ScannerCallable.java:63)
at org.apache.hadoop.hbase.client.RpcRetryingCaller.callWithoutRetries(RpcRetryingCaller.java:200)
at org.apache.hadoop.hbase.client.ScannerCallableWithReplicas$RetryingRPC.call(ScannerCallableWithReplicas.java:364)
at org.apache.hadoop.hbase.client.ScannerCallableWithReplicas$RetryingRPC.call(ScannerCallableWithReplicas.java:338)
at org.apache.hadoop.hbase.client.RpcRetryingCaller.callWithRetries(RpcRetryingCaller.java:126)
at org.apache.hadoop.hbase.client.ResultBoundedCompletionService$QueueingFuture.run(ResultBoundedCompletionService.java:65)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:745)
Caused by: GSSException: No valid credentials provided (Mechanism level: Failed to find any Kerberos tgt)
at sun.security.jgss.krb5.Krb5InitCredential.getInstance(Krb5InitCredential.java:147)
at sun.security.jgss.krb5.Krb5MechFactory.getCredentialElement(Krb5MechFactory.java:122)
at sun.security.jgss.krb5.Krb5MechFactory.getMechanismContext(Krb5MechFactory.java:187)
at sun.security.jgss.GSSManagerImpl.getMechanismContext(GSSManagerImpl.java:224)
at sun.security.jgss.GSSContextImpl.initSecContext(GSSContextImpl.java:212)
at sun.security.jgss.GSSContextImpl.initSecContext(GSSContextImpl.java:179)
at com.sun.security.sasl.gsskerb.GssKrb5Client.evaluateChallenge(GssKrb5Client.java:192)
... 26 more
This issue only happens when I run on YARN; when I run locally, it uses the same UserGroupInformation and connects to ClientService without any issues.
Do you have any idea why this is happening?
I already included all the needed configuration files (hbase-site.xml, core-site.xml, hdfs-site.xml) in the executor classpath, and I also set the JAAS config file.
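For reference, the JAAS file I pass to the executors looks roughly like this (a sketch; the entry name and paths are placeholders matching the redacted values above):

```
Client {
  com.sun.security.auth.module.Krb5LoginModule required
  useKeyTab=true
  keyTab="/path/to/keytab/someprincipal.keytab"
  principal="someprincial@REALM.COM"
  useTicketCache=false
  storeKey=true;
};
```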
I've noticed locally that the UGI initially comes from my OS user; when I connect to Phoenix, ConnectionQueryServicesImpl overrides the UGI with the one I specified in the JDBC URL, so subsequent connections use the correct UGI.
When running on the cluster, this does not seem to happen: even though I connect to Phoenix successfully, the next time the UGI is used it falls back to the OS user, even within the same executor.
Note that RpcClientImpl uses the CurrentUser, which is based on the OS user.
In my driver, whenever I get the CurrentUser, it uses Kerberos authentication with the principal (assuming kinit has been done, or the keytab and principal are provided in the spark-submit command).
In the executor, when there is a valid token on the node, the LoginUser is set to Kerberos authentication, but the CurrentUser is set to simple authentication using OS information.
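A quick way to see this difference is to print both users from inside an executor task (a minimal diagnostic sketch, assuming the Hadoop libraries are on the classpath; the class name is made up):

```java
import java.io.IOException;
import org.apache.hadoop.security.UserGroupInformation;

public class UgiDebug {
    // Diagnostic: compare LoginUser vs CurrentUser inside an executor task.
    // On my cluster, LoginUser reports KERBEROS while CurrentUser reports SIMPLE.
    public static void dump() throws IOException {
        UserGroupInformation login = UserGroupInformation.getLoginUser();
        UserGroupInformation current = UserGroupInformation.getCurrentUser();
        System.out.println("LoginUser:   " + login
                + " auth=" + login.getAuthenticationMethod());
        System.out.println("CurrentUser: " + current
                + " auth=" + current.getAuthenticationMethod());
    }
}
```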
How can I make the executor change the CurrentUser?
Anyway, I was able to solve it by forcing the use of the LoginUser via the UserGroupInformation.doAs() method.
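For anyone hitting the same problem, the workaround looks roughly like this (a sketch; the table name is made up and the JDBC URL is the redacted one from above):

```java
import java.security.PrivilegedExceptionAction;
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.Statement;
import org.apache.hadoop.security.UserGroupInformation;

// Run the Phoenix work as the Kerberos LoginUser instead of the
// simple-auth CurrentUser that the executor defaults to.
UserGroupInformation ugi = UserGroupInformation.getLoginUser();
ugi.doAs((PrivilegedExceptionAction<Void>) () -> {
    try (Connection conn = DriverManager.getConnection(
             "jdbc:phoenix:zookeeper_quorum:/hbase-secure:someprincial@REALM.COM:/path/to/keytab/someprincipal.keytab");
         Statement st = conn.createStatement();
         ResultSet rs = st.executeQuery("SELECT * FROM SOME_TABLE LIMIT 1")) {
        while (rs.next()) {
            // process rows here
        }
    }
    return null;
});
```

Opening the connection and running the query inside the same doAs block matters, since the HBase RPC layer captures the effective user when the connection is set up.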