
The whole logic of my application works with non-blocking sockets, but during the connection phase I found it better to switch the socket to blocking mode before performing the SSL handshake with SSL_connect(). Otherwise the handshake turned into a busy loop that spun until it finished successfully, and blocking the socket until then should be more efficient.

Here is the pseudo-code to my connect logic:

bool connect(host)
{
    int socket;
    init_socket(socket);

    set (socket, NONBLOCKING);
    connect_with_timeout(socket, host, 2s);
    if (timeout_failed || connect_errors) return false;

    set (socket, BLOCKING);
    SSL_connect (socket);
    if (ssl_connect_errors) return false;

    set (socket, NONBLOCKING);
    return true;
}

And the SSL handshake on non-blocking sockets looks like this:

do
{
    SSL_connect(socket);
}
while (!SSL_connection_errors);

Is it considered bad practice to switch a socket between blocking and non-blocking mode like this? What actually happens at a low level when you do it?

I know this looks like a micro-optimization, but I want to get it right: my application may attempt to reconnect once every 30 seconds, and the user sees an occasional large CPU spike lasting less than 1s.

Edit: The answers I got on this question made me see that attempting a read on a non-blocking socket and then sleeping is not really a good idea. However, I do need a platform-independent way to poll, so I went ahead and added a select() call with a 1s timeout on read operations. I tested it on both Windows and Linux, and CPU usage is lower now. Thanks.

Catalin
  • 2
    There's nothing wrong with changing the state of the socket. But why not just correctly do the handshake without blocking? If the `SSL_connect` would block, call `SSL_get_error` and correctly handle `SSL_ERROR_WANT_READ` and `SSL_ERROR_WANT_WRITE`. – David Schwartz Sep 02 '15 at 09:45
  • Doing that will result in that busy loop I'm trying to avoid. – Catalin Sep 02 '15 at 09:56
  • 2
    No, it won't, for the same reason your reads and writes don't result in busy loops. `SSL_connect` works just like `SSL_read`. [See here](http://stackoverflow.com/a/4124991/721269). – David Schwartz Sep 02 '15 at 10:06
  • Reads do result in busy loops, so I have to sleep the thread for a few microseconds every time I get an EWOULDBLOCK or SSL_WANT_READ / WRITE. – Catalin Sep 02 '15 at 10:08
  • 2
    Oh, then you asked the wrong question. You don't understand how to do non-blocking socket operations at all. – David Schwartz Sep 02 '15 at 10:09
  • Sleeping for a few microseconds is functionally indistinguishable from using blocking mode, except that on average you are likely to sleep twice as long as necessary. – user207421 Sep 02 '15 at 11:20

1 Answer


It appears that this is an XY problem. Your real issue is that you don't understand how to do non-blocking socket operations. For OpenSSL, when a non-blocking operation would have blocked, you should call SSL_get_error. If the error is SSL_ERROR_WANT_READ you should retry the operation when the socket is readable. If the error is SSL_ERROR_WANT_WRITE you should retry the operation when the socket is writable.

The next question is -- how do you wait for a socket to become readable or writable? That depends on your platform. For Linux, you can use poll, which also takes a timeout so you can control how long you wait.

Keep the socket non-blocking. Call SSL_connect. If the call would have blocked, call SSL_get_error; you will get SSL_ERROR_WANT_READ or SSL_ERROR_WANT_WRITE. Use poll to wait for the socket to become readable or writable, respectively. If you get a timeout or error, handle it appropriately. If the socket became readable/writable, call SSL_connect again.

You should handle SSL_read and SSL_write the same way.

David Schwartz
  • You seem to know a lot about OpenSSL :) Could you advise on the difference in performance between blocking and non-blocking for a client calling `SSL_read()`? I presumed blocking would be faster because otherwise you've got to continually traverse the OpenSSL stack, through ssl_lib.c, through the BIO layer, to `ReadSocket()` (I know the receive code a little), return to the application, and keep doing this. In contrast, blocking just waits at the socket. Very interested in your knowledge on the matter. – user997112 Aug 19 '22 at 02:02
  • 1
    @user997112 If the only thing you could possibly do is make forward progress on just one socket, then yes, blocking is faster. But that's very unlikely to be a situation where you care much about performance anyway. – David Schwartz Aug 19 '22 at 02:53
  • Thanks! Got one other question. I've been through the source code from `SSL_read()` to `ReadSocket()`/`read()`. I'm struggling to understand around `ssl3_read_bytes()` in `ssl3_record.c` and `ssl3_get_record()` in `rec_layer_s3.c`. Are there any design documents? Basically I just need to visualize what's happening to understand the code. I'm struggling to understand those parts. I know there's a "record" entity but that doesn't explain all the code. – user997112 Aug 21 '22 at 08:22