
I developed a Java program that runs on a Hadoop edge node (Cloudera CDH 5.9.1). It runs indefinitely because it is a (micro)service that replies to requests: it receives requests from a web app running elsewhere, extracts data from Impala using the Impala JDBC driver, and sends the data back.

It runs well for roughly 10 hours, then it becomes unable to get Impala JDBC connections. You will find the stack trace at the end of this question. In short, it throws javax.security.auth.login.LoginException: Unable to obtain Principal Name for authentication.

My understanding is that the Impala JDBC driver becomes unable to authenticate against Kerberos because the Kerberos ticket has expired. So, to make sure that the user who launched the program has a valid Kerberos ticket, I added a piece of code that runs before the JDBC connection is created. I read these two Stack Overflow threads:

They explain the issue and the solution very well. Here is my attempt to implement it (I don't understand why it fails for Impala):

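import java.io.IOException;
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.util.ArrayList;
import java.util.List;
import java.util.Properties;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileStatus;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.security.UserGroupInformation;
import org.apache.logging.log4j.LogManager;
import org.apache.logging.log4j.Logger;

public class KerberosRelogger {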

private final static Logger logger = LogManager.getLogger();

public static String userName;

public static String user;

public static String keytab;

public static Configuration conf;

private static UserGroupInformation ugi;

static {
    try {
        init();
    } catch (IOException e) {
        throw new RuntimeException(e);
    }
}

public static void init() throws IOException {
    userName = System.getProperty("user.name");
    user = userName + "@MY_REALM";
    keytab = System.getProperty("user.home") + "/" + userName + ".keytab";

    conf = new Configuration();
    conf.addResource(new Path("file:///etc/hadoop/conf/core-site.xml"));
    conf.addResource(new Path("file:///etc/hadoop/conf/hdfs-site.xml"));
    UserGroupInformation.setConfiguration(conf);
    ugi = UserGroupInformation.loginUserFromKeytabAndReturnUGI(user, keytab);

    logger.debug("userName : {}", userName);
    logger.debug("user : {}", user);
    logger.debug("keytab : {}", keytab);
}

public static void testHadoopConnexion() throws IOException {
    ugi.checkTGTAndReloginFromKeytab();
    FileSystem fs = FileSystem.get(conf);
    FileStatus[] statuses = fs.listStatus(new Path("/"));
    List<String> dirNames = new ArrayList<>();
    for (FileStatus status : statuses) {
        dirNames.add(status.getPath().getName());
    }
    logger.debug("HDFS root dirs : {}", dirNames);
}

private static Connection createImpalaConnection() throws IOException {
    ugi.checkTGTAndReloginFromKeytab();
    String IMPALAD_HOST = "my_node";
    String jdbcUrl = "jdbc:impala://" + IMPALAD_HOST + ":21050;AuthMech=1;KrbRealm=MY_REALM;KrbHostFQDN=" + IMPALAD_HOST + ";KrbServiceName=impala;SSL=1";
    Properties props = new Properties();
    props.put("user", userName);
    props.put("password", "");
    try {
        Connection connection = DriverManager.getConnection(jdbcUrl, props);
        return connection;
    } catch (SQLException e) {
        throw new RuntimeException("Fail to create JDBC connection", e);
    }
}

private static void testImpalaConnection() throws IOException, SQLException {
    int n = 0;
    try (Connection cnx = createImpalaConnection()) {
        String sql = "show databases";
        try (PreparedStatement ps = cnx.prepareStatement(sql)) {
            try (ResultSet rs = ps.executeQuery()) {
                while (rs.next()) {
                    n++;
                }
            }
        }
    }
    logger.debug("{} Impala databases", n);
}

public static void main(String[] args) {
    ScheduledExecutorService scheduledExecutorService = Executors.newSingleThreadScheduledExecutor();
    scheduledExecutorService.scheduleAtFixedRate(new Runnable() {
        @Override
        public void run() {
            try {
                logger.debug("Test Hadoop connection");
                testHadoopConnexion();
                logger.debug("Hadoop connection - SUCCESS");
            } catch (Exception e) {
                logger.error("Hadoop connection - FAIL", e);
            }

            System.out.println();
            try {
                logger.debug("Test Impala connection");
                testImpalaConnection();
                logger.debug("Impala connection - SUCCESS");
            } catch (Exception e) {
                logger.error("Impala connection - FAIL", e);
            }
        }
    }, 1, 10, TimeUnit.SECONDS);
}
}

If the ticket expires, or if I run "kdestroy" from the command line, the HDFS listing still works but the Impala JDBC driver is unable to get a connection.

Why can't Impala authenticate when the Hadoop API calls work? If the Hadoop API works, it means that the programmatic relogin worked and that a "fresh" ticket is there. Am I right? Why does Impala fail on its side?

Exception stacktrace

Caused by: java.sql.SQLException: [Simba][ImpalaJDBCDriver](500168) Error creating login context using ticket cache: Unable to obtain Principal Name for authentication .
    at com.cloudera.hivecommon.api.HiveServer2ClientFactory.createTransport(Unknown Source) ~[ImpalaJDBC41.jar:ImpalaJDBC_2.5.41.1061]
    at com.cloudera.hivecommon.api.HiveServer2ClientFactory.createClient(Unknown Source) ~[ImpalaJDBC41.jar:ImpalaJDBC_2.5.41.1061]
    at com.cloudera.hivecommon.core.HiveJDBCCommonConnection.establishConnection(Unknown Source) ~[ImpalaJDBC41.jar:ImpalaJDBC_2.5.41.1061]
    at com.cloudera.impala.core.ImpalaJDBCConnection.establishConnection(Unknown Source) ~[ImpalaJDBC41.jar:ImpalaJDBC_2.5.41.1061]
    at com.cloudera.jdbc.core.LoginTimeoutConnection.connect(Unknown Source) ~[ImpalaJDBC41.jar:ImpalaJDBC_2.5.41.1061]
    at com.cloudera.jdbc.common.BaseConnectionFactory.doConnect(Unknown Source) ~[ImpalaJDBC41.jar:ImpalaJDBC_2.5.41.1061]
    at com.cloudera.jdbc.common.AbstractDriver.connect(Unknown Source) ~[ImpalaJDBC41.jar:ImpalaJDBC_2.5.41.1061]
    at java.sql.DriverManager.getConnection(DriverManager.java:664) ~[?:1.8.0_92]
    at java.sql.DriverManager.getConnection(DriverManager.java:208) ~[?:1.8.0_92]
    ...
Caused by: com.cloudera.support.exceptions.GeneralException: [Simba][ImpalaJDBCDriver](500168) Error creating login context using ticket cache: Unable to obtain Principal Name for authentication .
    at com.cloudera.hivecommon.api.HiveServer2ClientFactory.createTransport(Unknown Source) ~[ImpalaJDBC41.jar:ImpalaJDBC_2.5.41.1061]
    at com.cloudera.hivecommon.api.HiveServer2ClientFactory.createClient(Unknown Source) ~[ImpalaJDBC41.jar:ImpalaJDBC_2.5.41.1061]
    at com.cloudera.hivecommon.core.HiveJDBCCommonConnection.establishConnection(Unknown Source) ~[ImpalaJDBC41.jar:ImpalaJDBC_2.5.41.1061]
    at com.cloudera.impala.core.ImpalaJDBCConnection.establishConnection(Unknown Source) ~[ImpalaJDBC41.jar:ImpalaJDBC_2.5.41.1061]
    at com.cloudera.jdbc.core.LoginTimeoutConnection.connect(Unknown Source) ~[ImpalaJDBC41.jar:ImpalaJDBC_2.5.41.1061]
    at com.cloudera.jdbc.common.BaseConnectionFactory.doConnect(Unknown Source) ~[ImpalaJDBC41.jar:ImpalaJDBC_2.5.41.1061]
    at com.cloudera.jdbc.common.AbstractDriver.connect(Unknown Source) ~[ImpalaJDBC41.jar:ImpalaJDBC_2.5.41.1061]
    at java.sql.DriverManager.getConnection(DriverManager.java:664) ~[?:1.8.0_92]
    at java.sql.DriverManager.getConnection(DriverManager.java:208) ~[?:1.8.0_92]
    ...
Caused by: javax.security.auth.login.LoginException: Unable to obtain Principal Name for authentication
    at com.sun.security.auth.module.Krb5LoginModule.promptForName(Krb5LoginModule.java:841) ~[?:1.8.0_92]
    at com.sun.security.auth.module.Krb5LoginModule.attemptAuthentication(Krb5LoginModule.java:704) ~[?:1.8.0_92]
    at com.sun.security.auth.module.Krb5LoginModule.login(Krb5LoginModule.java:617) ~[?:1.8.0_92]
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) ~[?:1.8.0_92]
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) ~[?:1.8.0_92]
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) ~[?:1.8.0_92]
    at java.lang.reflect.Method.invoke(Method.java:498) ~[?:1.8.0_92]
    at javax.security.auth.login.LoginContext.invoke(LoginContext.java:755) ~[?:1.8.0_92]
    at javax.security.auth.login.LoginContext.access$000(LoginContext.java:195) ~[?:1.8.0_92]
    at javax.security.auth.login.LoginContext$4.run(LoginContext.java:682) ~[?:1.8.0_92]
    at javax.security.auth.login.LoginContext$4.run(LoginContext.java:680) ~[?:1.8.0_92]
    at java.security.AccessController.doPrivileged(Native Method) ~[?:1.8.0_92]
    at javax.security.auth.login.LoginContext.invokePriv(LoginContext.java:680) ~[?:1.8.0_92]
    at javax.security.auth.login.LoginContext.login(LoginContext.java:587) ~[?:1.8.0_92]
    at com.cloudera.jdbc.kerberos.Kerberos.getSubjectViaTicketCache(Unknown Source) ~[ImpalaJDBC41.jar:ImpalaJDBC_2.5.41.1061]
    at com.cloudera.hivecommon.api.HiveServer2ClientFactory.createTransport(Unknown Source) ~[ImpalaJDBC41.jar:ImpalaJDBC_2.5.41.1061]
    at com.cloudera.hivecommon.api.HiveServer2ClientFactory.createClient(Unknown Source) ~[ImpalaJDBC41.jar:ImpalaJDBC_2.5.41.1061]
    at com.cloudera.hivecommon.core.HiveJDBCCommonConnection.establishConnection(Unknown Source) ~[ImpalaJDBC41.jar:ImpalaJDBC_2.5.41.1061]
    at com.cloudera.impala.core.ImpalaJDBCConnection.establishConnection(Unknown Source) ~[ImpalaJDBC41.jar:ImpalaJDBC_2.5.41.1061]
    at com.cloudera.jdbc.core.LoginTimeoutConnection.connect(Unknown Source) ~[ImpalaJDBC41.jar:ImpalaJDBC_2.5.41.1061]
    at com.cloudera.jdbc.common.BaseConnectionFactory.doConnect(Unknown Source) ~[ImpalaJDBC41.jar:ImpalaJDBC_2.5.41.1061]
    at com.cloudera.jdbc.common.AbstractDriver.connect(Unknown Source) ~[ImpalaJDBC41.jar:ImpalaJDBC_2.5.41.1061]
    at java.sql.DriverManager.getConnection(DriverManager.java:664) ~[?:1.8.0_92]
    at java.sql.DriverManager.getConnection(DriverManager.java:208) ~[?:1.8.0_92]
    ...
Comencau
  • `checkTGTAndReloginFromKeytab()` >> cf. https://stackoverflow.com/questions/34616676/should-i-call-ugi-checktgtandreloginfromkeytab-before-every-action-on-hadoop for very deep tech details and https://stackoverflow.com/questions/33211134/hbase-kerberos-connection-renewal-strategy/33243360#33243360 for context – Samson Scharfrichter Dec 06 '17 at 21:07
  • By the way, you might have found the answer in a fraction of the time you spent writing this question, if you had inspected the source code for `UserGroupInformation` class on GitHub (no JavaDoc for Hadoop security...) – Samson Scharfrichter Dec 06 '17 at 21:10
  • @Samson Scharfrichter Thank you for your answer. Indeed, I had already read those answers. I edited my question to make it clearer and to provide the full source code. It works for the Hadoop API but not for Impala. I don't understand why. – Comencau Dec 07 '17 at 16:21
  • Ahhh... You are using the Cloudera JDBC driver, built with the Simba SDK, that does not use the Hadoop code base. Try raw JAAS configuration instead, cf. my answer to https://stackoverflow.com/questions/42477466/error-when-connect-to-impala-with-jdbc-under-kerberos-authrication/42506620 – Samson Scharfrichter Dec 07 '17 at 19:59
  • @Samson Scharfrichter Thank you. It works with JAAS. – Comencau Jan 12 '18 at 16:48
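
The "raw JAAS configuration" suggested in the comments could look roughly like the sketch below. It is only an illustration under stated assumptions, not the code the asker ended up with. The helper class `JaasKeytabSetup` and its methods are hypothetical. The JAAS entry name `Client` is an assumption to be checked against the driver's documentation. The principal and keytab path reuse the conventions from the question's `init()` method. The two system properties (`java.security.auth.login.config` and `javax.security.auth.useSubjectCredsOnly`) are standard JDK settings. They most likely need to be in place before the first JDBC connection is opened, so passing them as `-D` flags at JVM start is the more predictable option.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical helper, not part of the question's code.
class JaasKeytabSetup {

    // Builds the text of a JAAS entry that logs in from a keytab with the
    // JDK's Krb5LoginModule instead of relying on the ticket cache.
    // The entry name is an assumption: check which name the driver looks up.
    static String buildJaasEntry(String entryName, String principal, String keytabPath) {
        return entryName + " {\n"
                + "  com.sun.security.auth.module.Krb5LoginModule required\n"
                + "  useKeyTab=true\n"
                + "  keyTab=\"" + keytabPath + "\"\n"
                + "  principal=\"" + principal + "\"\n"
                + "  doNotPrompt=true;\n"
                + "};\n";
    }

    // Writes the entry to a temporary file and points the JVM at it.
    static Path install(String entryName, String principal, String keytabPath) throws IOException {
        Path jaasFile = Files.createTempFile("jaas-", ".conf");
        String entry = buildJaasEntry(entryName, principal, keytabPath);
        Files.write(jaasFile, entry.getBytes(StandardCharsets.UTF_8));
        // Standard JDK property naming the JAAS login configuration file.
        System.setProperty("java.security.auth.login.config", jaasFile.toString());
        // Lets JGSS obtain credentials through JAAS when the current
        // Subject carries none.
        System.setProperty("javax.security.auth.useSubjectCredsOnly", "false");
        return jaasFile;
    }

    public static void main(String[] args) throws IOException {
        // Same naming conventions as init() in the question (assumed).
        String userName = System.getProperty("user.name");
        String principal = userName + "@MY_REALM";
        String keytab = System.getProperty("user.home") + "/" + userName + ".keytab";

        Path jaasFile = install("Client", principal, keytab);
        System.out.println("JAAS configuration written to " + jaasFile);
        System.out.println(new String(Files.readAllBytes(jaasFile), StandardCharsets.UTF_8));
    }
}
```

With such a configuration in place, the login no longer depends on the ticket cache that `kdestroy` or ticket expiry empties. The UserGroupInformation relogin in the question only refreshes credentials for the Hadoop client code.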
