
I'm not sure what is happening, since there are multiple moving parts here. We have a Cloudera cluster running HDFS, Hadoop, Impala, and HBase, with an F5 load balancer in front of all our Impala servers. We are trying to secure the cluster with Kerberos. My colleague has set up Kerberos using an MIT KDC. This setup works fine when we query Impala directly on a server, but not when we go through the F5 load balancer.

We ran kinit with a pre-created keytab file to get a ticket:

kinit -k -t /blah/keytabs/first.last.keytab first.last

When I run klist, it shows all these tickets:

$ klist
Ticket cache: FILE:/tmp/krb5cc_14377
Default principal: first.last@MADEUPNAME

Valid starting     Expires            Service principal
08/23/17 11:32:02  08/24/17 11:32:02  krbtgt/MADEUPNAME@MADEUPNAME
    renew until 08/23/17 11:32:02
08/23/17 11:33:39  08/24/17 11:32:02  impala/hslave32101.company.com@MADEUPNAME
    renew until 08/23/17 11:32:02

When I run my impala-shell command, it works fine:

$ impala-shell -i hslave32101.company.com:21000 -k -q "select 1"
Starting Impala Shell using Kerberos authentication
Using service name 'impala'
Connected to hslave32101.company.com:21000
Server version: impalad version 2.7.0-cdh5.9.2 RELEASE (build 2f7871169d894fab16f8a2fb99f2e34f0df8763d)
Query: select 1
Query submitted at: 2017-08-23 13:08:34 (Coordinator: http://hslave32101.company.com:25000)
Query progress can be monitored at: http://hslave32101.company.com:25000/query_plan?query_id=4940ca8ca2f267c5:5eeb29af00000000
+---+
| 1 |
+---+
| 1 |
+---+
Fetched 1 row(s) in 0.01s

However, when I run the command through the F5 load balancer, it fails. The principal it looks for doesn't match anything in klist; part of the name appears to have been replaced.

impala-shell -i bdaudit.company.com:21000 -d bigdata -k -q "select 1"
Starting Impala Shell using Kerberos authentication
Using service name 'impala'
Error connecting: TTransportException, Could not start SASL: Error in sasl_client_start (-1) SASL(-1): generic failure: GSSAPI Error: Unspecified GSS failure.  Minor code may provide more information (Server krbtgt/COMPANY.COM@MADEUPNAME not found in Kerberos database)
Not connected to Impala, could not execute queries.

The problem is this part of the error:

(Server krbtgt/COMPANY.COM@MADEUPNAME not found in Kerberos database)

Somehow, when going through the F5 VIP, first.last@MADEUPNAME becomes COMPANY.COM@MADEUPNAME. Does anyone know why this part of the ticket was replaced?

Classified
  • 1/2 "My colleague has setup Kerberos using MIT KDC. This setup works fine when we query impala directly to the server but not when we go thru an F5 load balancer." --> So, just a quick question on that. Is the F5 configured for pass-through to your app server or is it configured as reverse-proxy? I think it must be configured as a pass-through load-balancer in order for Kerberos to work, otherwise the authentication traffic will get stopped and dropped at the F5 without further configuration at the F5. I'm taking an educated guess here. – T-Heron Aug 24 '17 at 01:18
  • 2/2 "However, when I run my command thru the F5 loadbalancer, it doesn't work because the ticket it's looking for doesn't match what's in klist because it replaced part of it for some reason...Somehow when going thru the F5 VIP, it changes first.last@MADEUPNAME to COMPANY.COM@MADEUPNAME. Does anyone know why it replaced this part of the ticket?" --> These statements seem to confirm your F5 is acting as a reverse proxy toward Impala. Can you confirm or deny this? I don't think it will work without further configuration at the F5. – T-Heron Aug 24 '17 at 01:21
  • @T-Heron, thanks for your interest in my question. The F5 is set up as a pass-through. We got it working after finding Cloudera's documentation on how to do this. We still don't know why the name changed in the error, other than a hunch that the KDC was trying to find a resource called bdaudit.company.com, couldn't (since the resources have names like hslave33333.company.com), and for some reason truncated the name to company.com@MADEUPNAME. – Classified Aug 24 '17 at 20:11

1 Answer


Found the reason in Cloudera's instructions on how to set up Impala with an F5.

Here's the snippet from the PDF:

In Cloudera Manager, navigate to the Impala service, select the Configuration pane, then search for “balancer” to
find the Impala Daemons Load Balancer parameter. The load balancer should be specified in host:port format,
where host is your virtual server’s FQDN and port. These values are used by Cloudera Manager and are also passed
to Hue

If the Impala Daemons Load Balancer parameter is specified and Kerberos is enabled, Cloudera Manager adds a
principal for 'impala/<load_balancer_host>@<realm>' to the keytab for all Impala daemons. No additional
configuration is required for Kerberos.
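
To check that the load balancer principal described above really ended up in the Impala keytab, something like the following sketch may help. It assumes the MIT Kerberos client tools are installed; the helper names expected_spn and keytab_has_spn are hypothetical (not part of any Cloudera or Kerberos tooling), and the keytab path in the example is a placeholder.

```shell
#!/bin/sh
# Hypothetical helper: build the service principal a Kerberos client
# requests for a host, in the form <service>/<host>@<REALM>.
# The service name defaults to 'impala', as impala-shell reports.
expected_spn() {
    host=$1
    realm=$2
    service=${3:-impala}
    printf '%s/%s@%s\n' "$service" "$host" "$realm"
}

# Hypothetical helper: succeed if the keytab listing read from stdin
# contains the given principal as a fixed string.
# Feed it the output of `klist -kt <keytab>`.
keytab_has_spn() {
    grep -qF -- "$1"
}

# Example (the keytab path is a placeholder, run on an Impala daemon host):
#   spn=$(expected_spn bdaudit.company.com MADEUPNAME)
#   klist -kt /path/to/impala.keytab | keytab_has_spn "$spn" \
#       && echo "found $spn" || echo "missing $spn"
```

From a client that already holds a TGT, running kvno impala/bdaudit.company.com@MADEUPNAME should also obtain a service ticket once the KDC knows the principal.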
Classified