
I am trying to connect to a secured Phoenix cluster from Spark on YARN using JDBC, and I can see in the logs that the connection succeeds:

JDBC URL: jdbc:phoenix:zookeeper_quorum:/hbase-secure:someprincial@REALM.COM:/path/to/keytab/someprincipal.keytab

18/02/27 09:30:22 INFO ConnectionQueryServicesImpl: Trying to connect to a secure cluster with keytab:/path/to/keytab/someprincipal.keytab
18/02/27 09:30:22 DEBUG UserGroupInformation: hadoop login
18/02/27 09:30:22 DEBUG UserGroupInformation: hadoop login commit
18/02/27 09:30:22 DEBUG UserGroupInformation: using kerberos user:someprincial@REALM.COM
18/02/27 09:30:22 DEBUG UserGroupInformation: Using user: "someprincial@REALM.COM" with name someprincial@REALM.COM
18/02/27 09:30:22 DEBUG UserGroupInformation: User entry: "someprincial@REALM.COM"
18/02/27 09:30:22 INFO UserGroupInformation: Login successful for user someprincial@REALM.COM using keytab file /path/to/keytab/someprincipal.keytab
18/02/27 09:30:22 INFO ConnectionQueryServicesImpl: Successfull login to secure cluster!!

But later, when AbstractRpcClient is called, it fails. UserGroupInformation is no longer using Kerberos authentication, and it seems to pick up the OS user instead of the principal I provided in the JDBC URL:

18/02/27 09:30:23 DEBUG AbstractRpcClient: RPC Server Kerberos principal name for service=ClientService is hbase/hbaseprincipa@REALM.COM
18/02/27 09:30:23 DEBUG AbstractRpcClient: Use KERBEROS authentication for service ClientService, sasl=true
18/02/27 09:30:23 DEBUG AbstractRpcClient: Connecting to some.host.name/10.000.145.544:16020
18/02/27 09:30:23 DEBUG UserGroupInformation: PrivilegedAction as:someuser (auth:SIMPLE) from:org.apache.hadoop.hbase.ipc.RpcClientImpl$Connection.setupIOstreams(RpcClientImpl.java:734)
18/02/27 09:30:23 DEBUG HBaseSaslRpcClient: Creating SASL GSSAPI client. Server's Kerberos principal name is hbase/hbaseprincipa@REALM.COM
18/02/27 09:30:23 DEBUG UserGroupInformation: PrivilegedActionException as:someuser (auth:SIMPLE) cause:javax.security.sasl.SaslException: GSS initiate failed [Caused by GSSException: No valid credentials provided (Mechanism level: Failed to find any Kerberos tgt)]
18/02/27 09:30:23 DEBUG UserGroupInformation: PrivilegedAction as:someuser (auth:SIMPLE) from:org.apache.hadoop.hbase.ipc.RpcClientImpl$Connection.handleSaslConnectionFailure(RpcClientImpl.java:637)
18/02/27 09:30:23 WARN AbstractRpcClient: Exception encountered while connecting to the server : javax.security.sasl.SaslException: GSS initiate failed [Caused by GSSException: No valid credentials provided (Mechanism level: Failed to find any Kerberos tgt)]
18/02/27 09:30:23 ERROR AbstractRpcClient: SASL authentication failed. The most likely cause is missing or invalid credentials. Consider 'kinit'.
javax.security.sasl.SaslException: GSS initiate failed [Caused by GSSException: No valid credentials provided (Mechanism level: Failed to find any Kerberos tgt)]
    at com.sun.security.sasl.gsskerb.GssKrb5Client.evaluateChallenge(GssKrb5Client.java:211)
    at org.apache.hadoop.hbase.security.HBaseSaslRpcClient.saslConnect(HBaseSaslRpcClient.java:179)
    at org.apache.hadoop.hbase.ipc.RpcClientImpl$Connection.setupSaslConnection(RpcClientImpl.java:611)
    at org.apache.hadoop.hbase.ipc.RpcClientImpl$Connection.access$600(RpcClientImpl.java:156)
    at org.apache.hadoop.hbase.ipc.RpcClientImpl$Connection$2.run(RpcClientImpl.java:737)
    at org.apache.hadoop.hbase.ipc.RpcClientImpl$Connection$2.run(RpcClientImpl.java:734)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:422)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1724)
    at org.apache.hadoop.hbase.ipc.RpcClientImpl$Connection.setupIOstreams(RpcClientImpl.java:734)
    at org.apache.hadoop.hbase.ipc.RpcClientImpl$Connection.writeRequest(RpcClientImpl.java:887)
    at org.apache.hadoop.hbase.ipc.RpcClientImpl$Connection.tracedWriteRequest(RpcClientImpl.java:856)
    at org.apache.hadoop.hbase.ipc.RpcClientImpl.call(RpcClientImpl.java:1199)
    at org.apache.hadoop.hbase.ipc.AbstractRpcClient.callBlockingMethod(AbstractRpcClient.java:213)
    at org.apache.hadoop.hbase.ipc.AbstractRpcClient$BlockingRpcChannelImplementation.callBlockingMethod(AbstractRpcClient.java:287)
    at org.apache.hadoop.hbase.protobuf.generated.ClientProtos$ClientService$BlockingStub.scan(ClientProtos.java:32741)
    at org.apache.hadoop.hbase.client.ScannerCallable.openScanner(ScannerCallable.java:379)
    at org.apache.hadoop.hbase.client.ScannerCallable.call(ScannerCallable.java:201)
    at org.apache.hadoop.hbase.client.ScannerCallable.call(ScannerCallable.java:63)
    at org.apache.hadoop.hbase.client.RpcRetryingCaller.callWithoutRetries(RpcRetryingCaller.java:200)
    at org.apache.hadoop.hbase.client.ScannerCallableWithReplicas$RetryingRPC.call(ScannerCallableWithReplicas.java:364)
    at org.apache.hadoop.hbase.client.ScannerCallableWithReplicas$RetryingRPC.call(ScannerCallableWithReplicas.java:338)
    at org.apache.hadoop.hbase.client.RpcRetryingCaller.callWithRetries(RpcRetryingCaller.java:126)
    at org.apache.hadoop.hbase.client.ResultBoundedCompletionService$QueueingFuture.run(ResultBoundedCompletionService.java:65)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
    at java.lang.Thread.run(Thread.java:745)
Caused by: GSSException: No valid credentials provided (Mechanism level: Failed to find any Kerberos tgt)
    at sun.security.jgss.krb5.Krb5InitCredential.getInstance(Krb5InitCredential.java:147)
    at sun.security.jgss.krb5.Krb5MechFactory.getCredentialElement(Krb5MechFactory.java:122)
    at sun.security.jgss.krb5.Krb5MechFactory.getMechanismContext(Krb5MechFactory.java:187)
    at sun.security.jgss.GSSManagerImpl.getMechanismContext(GSSManagerImpl.java:224)
    at sun.security.jgss.GSSContextImpl.initSecContext(GSSContextImpl.java:212)
    at sun.security.jgss.GSSContextImpl.initSecContext(GSSContextImpl.java:179)
    at com.sun.security.sasl.gsskerb.GssKrb5Client.evaluateChallenge(GssKrb5Client.java:192)
    ... 26 more

This issue only happens when I run on YARN. When I run locally, the same UserGroupInformation is used and it connects to ClientService without any issues.

Do you have any idea why this is happening?

I have already put all the needed configuration files (hbase-site.xml, core-site.xml, hdfs-site.xml) on the executor classpath, and I have also set the JAAS config file.

I've noticed that locally the UGI initially comes from my OS user. Then, when I connect to Phoenix, Phoenix (ConnectionQueryServicesImpl.java) overrides the UGI with the one I specified in the JDBC URL, so later connections use the correct UGI.

On the cluster this does not seem to happen. Even though I connected to Phoenix successfully, the next use of the UGI picks up the OS user again, and this is within the same executor.


Notice that RpcClientImpl uses CurrentUser, which is based on the OS user.

In my driver, whenever I get the CurrentUser, it uses Kerberos authentication with the principal (assuming kinit has been done, or the keytab and principal are provided in the spark-submit command).

In the executor, when there is a valid token on the node, LoginUser is set to Kerberos authentication, but CurrentUser is set to simple authentication using the OS user information.

How can I make the executor change the CurrentUser?

Anyway, I was able to work around it by forcing the update: I run the calls as the LoginUser via the UserGroupInformation.doAs() method.

Azel
  • To whoever downvoted this, please give a reason so that I can improve my question. Thank you. – Azel Mar 02 '18 at 14:54
  • Did you get any solution for this? – Anup Ghosh May 12 '19 at 00:48
  • Hi Anup, unfortunately I didn't find a proper solution to this. I had a workaround (listed above), and alternatively we found out about the Spark HBase delegation token, which lets Spark handle the Kerberos authentication. – Azel May 13 '19 at 04:13

1 Answer

After several weeks, I finally figured it out. The key is setting spark.yarn.security.credentials.hbase.enabled to true.

Submit the Spark job as follows:

spark-submit \
  --master yarn \
  --keytab my-keytab \
  --principal my-principal \
  --conf spark.yarn.security.credentials.hbase.enabled=true \
  # other configs

And in the executor, create the Phoenix connection without the keytab and principal:

String url = "jdbc:phoenix:2.1.8.1:2181:/hbase";
Connection conn = DriverManager.getConnection(url, properties);
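
To make the difference between the two URL styles in this thread concrete, here is a minimal sketch of a helper that assembles both forms. PhoenixUrls and its method names are hypothetical and not part of Phoenix. The sketch assumes the colon-separated layout seen in the URLs above (quorum, port, znode, then optionally principal and keytab) and always writes the port, which the URL in the question omits.

```java
import java.util.Objects;

/** Hypothetical helper (not part of Phoenix) that assembles Phoenix JDBC URLs. */
final class PhoenixUrls {
    private PhoenixUrls() {}

    /** URL without credentials, for executors that rely on the HBase delegation token. */
    static String withoutCredentials(String quorum, int port, String znode) {
        Objects.requireNonNull(quorum, "quorum");
        Objects.requireNonNull(znode, "znode");
        return "jdbc:phoenix:" + quorum + ":" + port + ":" + znode;
    }

    /** URL with principal and keytab appended, as in the question. */
    static String withKeytab(String quorum, int port, String znode,
                             String principal, String keytab) {
        Objects.requireNonNull(principal, "principal");
        Objects.requireNonNull(keytab, "keytab");
        return withoutCredentials(quorum, port, znode) + ":" + principal + ":" + keytab;
    }

    public static void main(String[] args) {
        // Executor-side form from the answer: no principal, no keytab.
        System.out.println(withoutCredentials("2.1.8.1", 2181, "/hbase"));
        // Keytab form; the values below are placeholders, not real credentials.
        System.out.println(withKeytab("zookeeper_quorum", 2181, "/hbase-secure",
                "someprincipal@REALM.COM", "/path/to/keytab/someprincipal.keytab"));
    }
}
```

With the setting above enabled, only the first form should be needed inside executor code; the second form is the one from the question, which logs in successfully but does not change the user that the later RPC calls run as.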
Peter Zhao