Is it okay to change socket from non-blocking to blocking and then non-blocking again?

Question

The whole logic of my application works with non-blocking sockets, but in the connection phase I found it better to make the socket blocking, before performing the SSL handshake with SSL_connect(). This is because otherwise it created a busy loop until the handshake finished successfully, and actually blocking the socket until then should be more efficient.

Here is the pseudo-code to my connect logic:

bool connect(host)
{
    int socket;
    init_socket(socket);

    set (socket, NONBLOCKING);
    connect_with_timeout(socket, host, 2s);
    if (timeout_failed || connect_errors) return false;

    set (socket, BLOCKING);
    SSL_connect (socket);
    if (ssl_connect_errors) return false;

    set (socket, NONBLOCKING);
    return true;
}

And the SSL handshake on non-blocking sockets looks like this:

do
{
    SSL_connect(socket);
}
while (!SSL_connection_errors);

Is it considered bad practice to change the socket mode like this? What really happens at a low level when you do it?

I know this seems like a micro-improvement in performance, but I want to get it right, as my application may attempt to reconnect once every 30 seconds and the user gets an occasional large CPU spike lasting less than 1s.

Edit: The answers I got on this question made me see that attempting a read on a non-blocking socket and then sleeping is not really a good idea. However, I do need a platform-independent way to poll, so I went ahead and added a select with a 1s timeout on read operations. I tested it on both Windows and Linux, and CPU usage is lower now. Thanks.

There's nothing wrong with changing the state of the socket. But why not just correctly do the handshake without blocking? If the `SSL_connect` would block, call `SSL_get_error` and correctly handle `SSL_ERROR_WANT_READ` and `SSL_ERROR_WANT_WRITE`. — David Schwartz, Sep 02 '15 at 09:45
Doing that will result in that busy loop I'm trying to avoid. — Catalin, Sep 02 '15 at 09:56
No, it won't, for the same reason your reads and writes don't result in busy loops. `SSL_connect` works just like `SSL_read`. [See here](http://stackoverflow.com/a/4124991/721269). — David Schwartz, Sep 02 '15 at 10:06
Reads do result in busy loops, so I have to sleep the thread for a few microseconds every time I get an EWOULDBLOCK or `SSL_ERROR_WANT_READ` / `SSL_ERROR_WANT_WRITE`. — Catalin, Sep 02 '15 at 10:08
Oh, then you asked the wrong question. You don't understand how to do non-blocking socket operations at all. — David Schwartz, Sep 02 '15 at 10:09
Sleeping for a few microseconds is functionally indistinguishable from using blocking mode, except that on average you are likely to sleep twice as long as necessary. — user207421, Sep 02 '15 at 11:20

David Schwartz · Accepted Answer · 2015-09-02T10:38:54.310


It appears that this is an XY problem. Your real issue is that you don't understand how to do non-blocking socket operations. For OpenSSL, when a non-blocking operation would have blocked, you should call SSL_get_error. If the error is SSL_ERROR_WANT_READ you should retry the operation when the socket is readable. If the error is SSL_ERROR_WANT_WRITE you should retry the operation when the socket is writable.

The next question is -- how do you wait for a socket to become readable or writable? That depends on your platform. For Linux, you can use poll, which also takes a timeout so you can control how long you wait.
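As a rough illustration of waiting with poll, here is a minimal sketch. The helper name `wait_fd` and its return convention are assumptions made for this example, not part of any library.

```c
#include <errno.h>
#include <poll.h>

/* Hypothetical helper: wait until fd is ready for `events`
 * (POLLIN for readable, POLLOUT for writable) or timeout_ms elapses.
 * Returns 1 if ready, 0 on timeout, -1 on error. */
static int wait_fd(int fd, short events, int timeout_ms)
{
    struct pollfd pfd = { .fd = fd, .events = events, .revents = 0 };
    int rc;

    /* Retry if a signal interrupts the wait. Note that this simple
     * version restarts the full timeout after each interruption. */
    do {
        rc = poll(&pfd, 1, timeout_ms);
    } while (rc < 0 && errno == EINTR);

    if (rc <= 0)
        return rc;                      /* 0 = timeout, -1 = error */
    if (pfd.revents & (POLLERR | POLLNVAL))
        return -1;
    return 1;                           /* ready, or peer hung up */
}
```

A hang-up is reported as ready on purpose: the caller retries its read or write and learns about the closed connection from that call.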

Keep the socket non-blocking. Call SSL_connect. If the call would have blocked, call SSL_get_error; you will get SSL_ERROR_WANT_READ or SSL_ERROR_WANT_WRITE. Use poll to wait for the socket to become readable or writable, respectively. If you get a timeout or error, handle it appropriately. If the socket became readable or writable, call SSL_connect again.

You should handle SSL_read and SSL_write the same way.

edited Sep 02 '15 at 10:38

answered Sep 02 '15 at 10:13

David Schwartz


You seem to know a lot about OpenSSL :) Could you advise on the difference in performance between blocking and non-blocking for a client calling `SSL_read()`? I presumed blocking would be faster because otherwise you have to repeatedly traverse the OpenSSL stack, through ssl_lib.c and the BIO layer, down to `ReadSocket()` (I know the receive code a little), return to the application, and keep doing this. In contrast, blocking just waits at the socket. Very interested in your knowledge on the matter. – user997112 Aug 19 '22 at 02:02

@user997112 If the only thing you could possibly do is make forward progress on just one socket, then yes, blocking is faster. But that's very unlikely to be a situation where you care much about performance anyway. – David Schwartz Aug 19 '22 at 02:53
Thanks! Got one other question. I've been through the source code from `SSL_read()` to `ReadSocket()`/`read()`. I'm struggling to understand the code around `ssl3_read_bytes()` in `rec_layer_s3.c` and `ssl3_get_record()` in `ssl3_record.c`. Are there any design documents? Basically I just need to visualize what's happening to understand the code. I know there's a "record" entity but that doesn't explain all the code. – user997112 Aug 21 '22 at 08:22
