Enable Observability (Logging/Metrics) of TLS Handshakes on Embedded Tomcat 8.5 with Java 8

Question

We are running Spring Boot APIs where we terminate TLS in the API itself. Several times we have observed excessive CPU usage that, after extensive searching, turned out to be caused by someone creating many connections (legitimately, or erroneously because of rejected client certs) or not using TLS session resumption.

To prevent these long and costly searches in the future, we would like to log when a handshake fails or succeeds, why, and whether session resumption is being used.

We are not specifically tied to our current stack, and upgrading to a different server like Undertow or WebFlux, and/or a new version of Java would be fine as well. Similarly, we are fine using APR, NIO, or native bindings to achieve these goals.

Other questions on this topic suggest that there is currently no out-of-the-box solution. They suggest extending JSSEImplementation, creating a customized SSL socket factory, or turning the log level of the NIO adapter to debug. These solutions feel fragile, and I wonder whether there is a more extensible mechanism based on events or callbacks. Alternatively, we could enable the handshake logs from Java, but those are verbose, and we would incur a significant performance hit when doing so.

Update1: I've tried to go the route of using a customized SSLServerSocketFactory. The sun.security.ssl.SSLServerSocketFactoryImpl returns a sun.security.ssl.SSLServerSocketImpl on bind, which returns a nice SSLSocket on accept. I could wrap that accept method so that it always adds a completion handler. The only drawback is that SSLServerSocketFactoryImpl is final, so I cannot just wrap it. This means I need to copy a lot of code, and it would still only give me metrics on successful handshakes. Copying the code would be a maintenance burden because this is JRE-specific code.
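For illustration, the completion-handler idea could be sketched roughly as below. This is only a sketch under assumptions: `HandshakeStats` and its `record` method are hypothetical names, not part of Tomcat or the JRE, and it presumes you can get hold of the accepted `SSLSocket` at all (connectors built on `SSLEngine` never expose one). It treats a repeated session ID as a resumed handshake, which matches TLS 1.2 session-ID resumption but may not hold for other resumption mechanisms.

```java
import javax.net.ssl.HandshakeCompletedEvent;
import javax.net.ssl.HandshakeCompletedListener;
import java.util.Base64;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicLong;

/** Hypothetical helper: counts full vs. resumed handshakes by remembering session IDs. */
class HandshakeStats implements HandshakeCompletedListener {
    // Unbounded for brevity; a real deployment would need eviction or a size cap.
    private final Set<String> seenSessionIds = ConcurrentHashMap.newKeySet();
    private final AtomicLong full = new AtomicLong();
    private final AtomicLong resumed = new AtomicLong();

    /** Called by JSSE after a successful handshake only; failures never reach this method. */
    @Override
    public void handshakeCompleted(HandshakeCompletedEvent event) {
        record(event.getSession().getId());
    }

    /** Returns true if the session ID was seen before, i.e. the handshake was probably resumed. */
    boolean record(byte[] sessionId) {
        // Assumption: an empty ID means the session is not resumable, so count it as full.
        if (sessionId == null || sessionId.length == 0) {
            full.incrementAndGet();
            return false;
        }
        String key = Base64.getEncoder().encodeToString(sessionId);
        boolean seenBefore = !seenSessionIds.add(key); // add() returns false if already present
        (seenBefore ? resumed : full).incrementAndGet();
        return seenBefore;
    }

    long fullHandshakes() { return full.get(); }
    long resumedHandshakes() { return resumed.get(); }
}
```

It would be attached with `sslSocket.addHandshakeCompletedListener(stats)` on each accepted socket. Failed handshakes would still have to be captured elsewhere, for example by catching the `SSLHandshakeException` thrown while the handshake runs.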

_for example event or callbacks_: This has already been asked on SO without any good answer, from what I remember. You also list the solutions that were mentioned (extending JSSE, implementing a factory). — Eugène Adell, Sep 12 '18 at 08:29
@EugèneAdell, indeed, it has been asked before. But one question was a year ago and the other originated in 2008. A lot can happen in that time. — Alessandro Vermeulen, Sep 12 '18 at 09:42
The point is that JSSE was designed to isolate the application from the underlying SSL/TLS handling. It's a good idea, as it makes the whole thing pluggable with different providers, but this side effect of not having easy access to call-backs is penalizing statistics collection (protocols, cipher suites used by the clients, error causes). Designers didn't want to know why clients cannot negotiate with the server, they believe these clients are old/broken and not interesting. I checked what OpenSSL could do, but it's not better. — Eugène Adell, Sep 12 '18 at 11:16
Interestingly, a [HandshakeCompletedEvent](https://docs.oracle.com/javase/8/docs/api/javax/net/ssl/HandshakeCompletedEvent.html) does exist. If I can get myself hooked into that, I can at least see the successful connects, and by keeping track of the session IDs I can track resumptions. — Alessandro Vermeulen, Sep 12 '18 at 12:07
Yes, this event is raised on **successful** handshakes **only**. I used it on the client side for my [client](https://sourceforge.net/projects/jtouch/?source=directory). On the server side, it's much more interesting to find out failure causes, in my opinion. — Eugène Adell, Sep 12 '18 at 12:24

score 0 · Answer 1 · answered Oct 23 '20 at 07:43

My answer might not be what you expect, but it is what I would have done myself.

First of all, I never enable SSL in custom software, whether it is written in Java, C#, Python, or JavaScript. In all my solutions the applications run over plain HTTP.

I delegate all TLS handling to NGINX. It is reliable. It is fast. It has tons of options. It has versatile and detailed logging. It has some basic access control and DDoS protection. It encapsulates the details of the deployment and provides a single facade to multiple services.

The overhead is small, and it runs well even on modest hardware.

You need two features: reverse proxy and detailed logging.

The simplest config file looks like this:

server {
        listen 443 ssl;

        server_name example.com;
        ssl_protocols       TLSv1 TLSv1.1 TLSv1.2;

        location / {
                # Forward all requests to the actual server using HTTP
                proxy_pass http://<server-in-intranet>:12345;
                proxy_set_header Host $host;
        }
        # TLS handshake errors are reported at the info level
        error_log /var/log/nginx/example.com/error.log info;
        # Extra ideas about SSL logging: 
        #   https://docs.nginx.com/nginx/admin-guide/monitoring/logging/#tls_sample

        # The certificates from Let's Encrypt are installed by Certbot
        ssl_certificate /etc/letsencrypt/live/example.com/fullchain.pem; # managed by Certbot
        ssl_certificate_key /etc/letsencrypt/live/example.com/privkey.pem; # managed by Certbot
}

With this config, https://example.com/ serves the content of your actual server over HTTPS, while the actual server, which runs somewhere inside the intranet, speaks plain HTTP.

Using this setup I run servers written in Go, JavaScript and Python that run on different machines but are collected under a single point of access, e.g. https://global.name/service1/, https://global.name/service2/, https://global.name/service3/

This adds more complexity to the setup and yet another hop. Tomcat provides OpenSSL support out of the box, which covers > 80% of all use cases. — Michael-O, Oct 23 '20 at 08:36

score 0 · Answer 2 · answered Oct 29 '20 at 01:52

Is this a lone server or a collection of servers behind a load balancer?

You may consider "redeploying" the server such that you have a duplicate with the same config but with the Java option -Djavax.net.debug=ssl:handshake enabled.

Now, on the load balancer, you direct a portion of the traffic to your debugging server to sample the activity you are interested in.

Alternatively, you can deploy another instance of Tomcat on the same server, on a different port, with the debug option turned on. (This is less than ideal, as it puts added load on a server that you mention might already be in trouble in times of increased requests.)

Maybe you don't have a load balancer, but you probably have a firewall. Check whether your firewall is stateful and can "split traffic" for you.

If the current server is a Linux server, you can use iptables to do this in the "dual local install" example I mention above, with something like this: https://www.webair.com/community/simple-stateful-load-balancer-with-iptables-and-nat/

There is no getting around a complex solution.

If you don't have a load balancer, you might want to consider one, as it gives you a lot of flexibility to deal with various situations, not just this one.

Good luck

David
