We have a requirement to use sparklyr to execute model code written in R on Spark. The Spark cluster we use is Kerberized. We are able to connect to this cluster and execute our code using a keytab. The challenge now is that we need to establish the connection with a username and password instead of a keytab. Any help or pointers on how this could be accomplished would be greatly appreciated.
-
If you are using RStudio, you can click on `Tools -> Shell` and get a Kerberos ticket using `kinit username@domain`. After that, you can access the Kerberized cluster using `sparklyr` functions. – Jaime Caffarel Jul 19 '17 at 10:06
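A minimal sketch of the ticket check described in this comment, assuming a POSIX shell with the MIT Kerberos `klist` and `kinit` tools on the `PATH`. The user name and realm below are hypothetical placeholders, not values from the question. The script only reports whether a valid ticket is cached and prints the `kinit` command to run; it does not prompt for the password itself.

```shell
# Hypothetical principal; replace with your own user and realm.
KRB_USER="username"
KRB_REALM="EXAMPLE.COM"
PRINCIPAL="${KRB_USER}@${KRB_REALM}"

# klist -s runs silently and exits 0 only if a readable,
# unexpired credentials cache exists.
have_ticket() {
  command -v klist >/dev/null 2>&1 && klist -s
}

if have_ticket; then
  echo "Kerberos ticket present"
else
  # kinit prompts interactively for the password of the principal.
  echo "No valid ticket; run: kinit ${PRINCIPAL}"
fi
```

Once `kinit` succeeds, running `klist` without options should list the cached ticket before you start the `sparklyr` session.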
-
Thanks, Jaime, for the answer. I believe that to run `kinit` I would need to have Kerberos installed on my laptop. The RStudio we use is RStudio Desktop on Windows. We are attempting to connect to a remote Spark cluster using YARN client mode. – Asish Balakrishnan Jul 19 '17 at 13:28
-
Another option would be to install RStudio Server on one of the cluster nodes, and execute the `kinit` command from there. We have a similar configuration with a yarn cluster and it works. – Jaime Caffarel Jul 19 '17 at 17:55
-
@Jaime - RStudio Server is currently off the table. I was able to source the Kerberos client binaries (MIT Kerberos for Windows 3.2.2) from http://web.mit.edu/kerberos/dist/. I configured leash32 with the krb.conf to point to the remote krb server. I was able to `kinit` and get the ticket, but it appears the ticket is not passed along when I call sparklyr/rhdfs/sparkR. I see a "login failed" error on the screen. – Asish Balakrishnan Jul 20 '17 at 15:17
-
I'm not sure whether that can be done with RStudio Desktop. Asking RStudio Support (https://support.rstudio.com/hc/en-us) may confirm whether this is possible. – Jaime Caffarel Jul 21 '17 at 07:05