I am submitting a Spark job that writes to a Kerberized cluster with the following command. I didn't add any code to the Spark program to enable authentication; I just passed the principal and keytab to spark-submit. But I am getting a 'Failed to renew token' error. My Spark program could connect to the Hive metastore.

What is causing this?

> ./spark-submit --class com.abcd.xyz.voice.cc.cc.cc --verbose --master
> yarn --deploy-mode cluster --executor-cores 6 --executor-memory 6g
> --driver-java-options "-Dlog4j.configuration=file:/app/home/abcd/conf/my_Driver.log4j" 
> --principal myowner@CABLE.abcd.COM --keytab /app/home/emm/myfile.keytab --conf
> "spark.executor.extraJavaOptions=-Dlog4j.configuration=file:/app/home/emm/conf/my_Executor.log4j
> -Ddm.logging.name=LegalDemand" /app/home/emm/bin/myjar.jar --files file:///app/home/emm/mykeytab.keytab --conf
> spark.hadoop.fs.hdfs.impl.disable.cache=true
> /app/home/emm/conf/my.properties

> 17/07/11 18:02:42 INFO yarn.Client:
> client token: N/A
> diagnostics: Failed to renew token: Kind: TIMELINE_DELEGATION_TOKEN, Service: 172.27.30.133:8188, Ident: (owner=myowner, renewer=yarn, realUser=, issueDate=1499796160528, maxDate=1500400960528, sequenceNumber=74505, masterKeyId=294)
> ApplicationMaster host: N/A
> ApplicationMaster RPC port: -1
> queue: default
> start time: 1499796160725
> final status: FAILED
> tracking URL: http://abcd.net:8088/proxy/application_1499697586013_1727/
> user: myowner

AKC
  • Can we know *(a)* the version of Spark you are using, *(b)* how long the job ran before that error, and *(c)* whether there are interesting error messages in the YARN log of the AppMaster for `application_1499697586013_1727`? – Samson Scharfrichter Jul 11 '17 at 21:49
  • The version is 2.0.2 (Apache Spark). – AKC Jul 11 '17 at 21:50
  • A few things I found: 1) 17/07/11 20:11:15 INFO hive.metastore: Connected to metastore. 2) 17/07/11 20:13:25 INFO spark.SecurityManager: SecurityManager: authentication disabled; ui acls disabled; users with view permissions: Set(emm, myuser); groups with view permissions: Set(); users with modify permissions: Set(emm, myuser); groups with modify permissions: Set() – AKC Jul 11 '17 at 21:51
  • I got this message as well: 17/07/11 20:10:50 INFO security.UserGroupInformation: Login successful for user myuser@CABLE.abcd.COM using keytab file /app/home/emm/myuser.keytab – AKC Jul 11 '17 at 21:52
  • The messages above are from the Spark logs. The YARN application log was not available; when I try to fetch it I get: Failed to read the attempts of the application application_1499697586013_1867. – AKC Jul 11 '17 at 22:00
  • Hmm... `yarn-cluster` with Kerberos is tricky; you could raise the log level to DEBUG and enable some badly documented Kerberos trace flags (in badly documented Spark env variables for the launcher, plus in better documented properties for driver & executor), then dig into the Spark source code and make use of your many years of experience with Kerberos + Hadoop... or maybe you should just give up and switch to `yarn-client`. – Samson Scharfrichter Jul 11 '17 at 22:12
  • I will try that. I can see the message that I got connected to the Hive metastore. Does that mean the Kerberos authentication was successful initially? – AKC Jul 11 '17 at 22:36
  • There's the initial Kerberos auth, then requests for multiple Krb service tickets for `service/hostname` (w/ strict DNS checks), then requests for multiple Hadoop delegation tokens (because Krb was not designed for distributed systems). That error message is a misleading catch-all that was already mentioned in several JIRAs, about Krb and/or networking conf and/or Spark bugs. – Samson Scharfrichter Jul 12 '17 at 07:28
  • I am going to verify through logs now. Thank you for the information. – AKC Jul 12 '17 at 17:22
  • Found this in the YARN logs: Exception in thread "main" org.apache.hadoop.service.ServiceStateException: java.io.IOException: Failed on local exception: java.io.IOException: javax.security.sasl.SaslException: GSS initiate failed [Caused by GSSException: No valid credentials provided (Mechanism level: Failed to find any Kerberos tgt)]; Host Details : local host is: "ebdp-ch2-c007s.sys.abcd.net/172.27.66.10"; destination host is: "ebdp-ch2-k008p.sys.abcd.net":8020; – AKC Jul 12 '17 at 18:07
  • From "Hadoop and Kerberos: The Madness beyond the Gate" (https://steveloughran.gitbooks.io/kerberos_and_hadoop/content/sections/errors.html) _Security error messages appear to take pride in providing limited information... "Failed to find any Kerberos tgt" -- It's very common, and essentially means "you weren't authenticated"_ >> go to section "Low-level secrets" to know more about the trace flags. Also, consider where the error was reported -- in the launcher, in the driver (i.e. in the YARN AppMaster in cluster mode), in one of the executors? – Samson Scharfrichter Jul 12 '17 at 21:38
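
The trace flags discussed in the comments above could be wired into the submit command roughly as follows. This is only a sketch, not a confirmed fix: `submit_with_krb_debug` and the `SPARK_SUBMIT` override are hypothetical names introduced here for illustration, and the principal, keytab, class and jar paths are copied from the question. The `sun.security.krb5.debug` and `sun.security.spnego.debug` system properties, the `HADOOP_JAAS_DEBUG` and `SPARK_SUBMIT_OPTS` environment variables, and the `spark.yarn.appMasterEnv.*` property are existing JDK, Hadoop and Spark settings; whether their output pinpoints this particular failure is an assumption.

```shell
# Standard JDK Kerberos / SPNEGO trace switches.
KRB_DEBUG_OPTS="-Dsun.security.krb5.debug=true -Dsun.security.spnego.debug=true"

# Launcher side: the JVM started by spark-submit itself.
export HADOOP_JAAS_DEBUG=true
export SPARK_SUBMIT_OPTS="$KRB_DEBUG_OPTS"

# Hypothetical helper. Set SPARK_SUBMIT=echo for a dry run that only
# prints the arguments instead of submitting anything.
submit_with_krb_debug() {
  "${SPARK_SUBMIT:-./spark-submit}" \
    --class com.abcd.xyz.voice.cc.cc.cc \
    --master yarn --deploy-mode cluster \
    --principal myowner@CABLE.abcd.COM \
    --keytab /app/home/emm/myfile.keytab \
    --conf "spark.driver.extraJavaOptions=$KRB_DEBUG_OPTS" \
    --conf "spark.executor.extraJavaOptions=$KRB_DEBUG_OPTS" \
    --conf spark.yarn.appMasterEnv.HADOOP_JAAS_DEBUG=true \
    /app/home/emm/bin/myjar.jar "$@"
}

# Dry run: print what would be submitted, with the properties file
# passed through as the application argument.
SPARK_SUBMIT=echo submit_with_krb_debug /app/home/emm/conf/my.properties
```

The sketch keeps every spark-submit option before the application jar, because spark-submit treats everything after the jar as arguments to the application itself. The log4j `-D` options from the original command are left out for brevity; they would have to be merged into the same `extraJavaOptions` values rather than set separately.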

0 Answers