I'm having a problem using Polly while trying to accomplish the following:

  • Reconnect logic - I tried to create a Polly policy, which works when you call StartAsync without an Internet connection. However, once execution reaches ReceiveLoop, the policy no longer has any effect on that method, and if the connection drops at that point, it never tries to reconnect. It simply logs the following exception: Disconnected: The remote party closed the WebSocket connection without completing the close handshake. Perhaps I should have two policies, one in StartAsync and one in ReceiveLoop, but that doesn't feel right to me, which is why I'm asking.

  • Timeouts - I want to add timeouts to each ClientWebSocket method call, e.g. ConnectAsync, SendAsync, etc. I'm not that familiar with Polly, but I believe this policy does that for us automatically. However, I need someone to confirm that. By timeout, I mean logic similar to _webSocket.ConnectAsync(_url, CancellationToken.None).TimeoutAfter(timeoutMilliseconds). The TimeoutAfter implementation can be found here. An example of how other repos did it can be found here.

In short, I want to make this class resilient: instead of trying to connect to a dead web socket server for 30 seconds without success, whatever the reason, it should fail fast -> retry in 10 seconds -> fail fast -> retry again, and so on. This wait-and-retry logic should repeat until we call StopAsync or dispose the instance.

You can find the WebSocketDuplexPipe class on GitHub.

public sealed class Client : IDisposable
{
    private const int RetrySeconds = 10;
    private readonly WebSocketDuplexPipe _webSocketPipe;
    private readonly string _url;

    public Client(string url)
    {
        _url = url;
        _webSocketPipe = new WebSocketDuplexPipe();
    }

    public Task StartAsync(CancellationToken cancellationToken = default)
    {
        var retryPolicy = Policy
            .Handle<Exception>(e => !cancellationToken.IsCancellationRequested)
            .WaitAndRetryForeverAsync(_ => TimeSpan.FromSeconds(RetrySeconds),
                (exception, calculatedWaitDuration) =>
                {
                    Console.WriteLine($"{exception.Message}. Retry in {calculatedWaitDuration.TotalSeconds} seconds.");
                });

        return retryPolicy.ExecuteAsync(async () =>
        {
            await _webSocketPipe.StartAsync(_url, cancellationToken).ConfigureAwait(false);
            _ = ReceiveLoop();
        });
    }

    public Task StopAsync()
    {
        return _webSocketPipe.StopAsync();
    }

    public async Task SendAsync(string data, CancellationToken cancellationToken = default)
    {
        var encoded = Encoding.UTF8.GetBytes(data);
        var bufferSend = new ArraySegment<byte>(encoded, 0, encoded.Length);
        await _webSocketPipe.Output.WriteAsync(bufferSend, cancellationToken).ConfigureAwait(false);
    }

    private async Task ReceiveLoop()
    {
        var input = _webSocketPipe.Input;

        try
        {
            while (true)
            {
                var result = await input.ReadAsync().ConfigureAwait(false);
                var buffer = result.Buffer;

                try
                {
                    if (result.IsCanceled)
                    {
                        break;
                    }

                    if (!buffer.IsEmpty)
                    {
                        while (MessageParser.TryParse(ref buffer, out var payload))
                        {
                            var message = Encoding.UTF8.GetString(payload);

                            _messageReceivedSubject.OnNext(message);
                        }
                    }

                    if (result.IsCompleted)
                    {
                        break;
                    }
                }
                finally
                {
                    input.AdvanceTo(buffer.Start, buffer.End);
                }
            }
        }
        catch (Exception ex)
        {
            Console.WriteLine($"Disconnected: {ex.Message}");
        }
    }
}
Peter Csala
nop
  • So, what is your question? – Peter Csala Mar 16 '22 at 19:27
  • @PeterCsala, Q1: I want to apply the same policy for `ReceiveLoop`, because it doesn't reconnect if it reaches that part. Q2: I wonder if the timeouts are automatically taken care of by Polly. I described what timeout behavior I expect (TimeoutAfter), but with Polly. – nop Mar 16 '22 at 19:32
  • Your retry policy will exit from ExecuteAsync with success while you are still waiting for input.ReadAsync to finish. Because you are not awaiting ReceiveLoop but just kicking it off in a fire-and-forget manner, your policy will not have any effect on it. – Peter Csala Mar 16 '22 at 22:26
  • Btw, you can and should pass the cancellationToken parameter to ExecuteAsync. – Peter Csala Mar 16 '22 at 22:30
  • @PeterCsala, thanks, I can pass a CancellationToken, but if I await ReceiveLoop, it blocks the UI. Is there any Polly way to unblock the thread or should I just use `Task.Run`? – nop Mar 17 '22 at 00:12
  • Well, the thing is if you want to cover it properly with retry then either you have to move the retry logic inside the `ReceiveLoop` or as you said you can move this operation onto a dedicated thread. – Peter Csala Mar 17 '22 at 12:50
  • For your Timeout question: Polly's Timeout policy can be set up to use an optimistic or a pessimistic strategy. [The former one](https://github.com/App-vNext/Polly/wiki/Timeout#optimistic-timeout) relies heavily on the `CancellationToken`. So if you pass `CancellationToken.None` to `ExecuteAsync`, it will create a token for you and use it to cancel the operation whenever the timeout is reached. BUT please be aware that it will throw `TimeoutRejectedException`. – Peter Csala Mar 17 '22 at 12:58
  • Did I answer your questions or do you need further clarification? – Peter Csala Mar 17 '22 at 16:08
  • @PeterCsala, I honestly don't know. https://pastebin.com/3HavmN6z. No idea how to print a message when a timeout occurs or something. I'm able to see that it reconnects in 30 seconds. – nop Mar 17 '22 at 16:17
  • TimeoutAsync has an overload that accepts an onTimeoutAsync delegate. That delegate is called whenever a timeout occurs. – Peter Csala Mar 17 '22 at 16:53
  • @PeterCsala, look at what I did https://pastebin.com/M1ZCHkff. Here is the log from the execution: https://pastebin.com/GzzXJTQG. The timeout is never triggered, maybe because the `WaitAndRetryForever` policy overrides it? If you also pay attention to the log, when it actually reaches `ReceiveLoop`, it *will* try to connect, but it won't succeed, probably because the web socket is no longer alive. – nop Mar 17 '22 at 20:08
  • The problem is your `ExecuteAsync` since you are calling it without passing the cancellation token, like this: https://dotnetfiddle.net/YfTomG – Peter Csala Mar 18 '22 at 08:13
  • @PeterCsala, I see what the problem is. It works great if I disconnect Wi-Fi before we establish the web socket connection, but when the web socket connection is already established and it is basically looping through `ReceiveLoop` and then I disconnect Wi-Fi, it throws `WebSocketException` inside the wrapper, which the wrapper handles by itself. It never tries to reconnect, because there is no open web socket connection and the pipes are completed. https://github.com/ninjastacktech/ninja-websocket-net/blob/master/src/Ninja.WebSocketClient/Transport/WebSocketDuplexPipe.cs#L154. Maybe I should change the wrapper? – nop Mar 18 '22 at 11:41
  • @nop I don't have much experience with WebSocket, so I cannot help on that end. If you have questions regarding Polly, I'm here to help you. – Peter Csala Mar 18 '22 at 11:54
  • @PeterCsala, I understand. You can write it as an answer, so I can accept it. If possible, could you please edit your example to use a web socket URL, i.e. `wss://url` instead of HttpClient, so I know for sure it works for that. – nop Mar 18 '22 at 11:58
  • Okay, I will do it during the weekend. :) – Peter Csala Mar 18 '22 at 17:19
  • Did I miss something from my post? – Peter Csala Mar 21 '22 at 07:27
  • @PeterCsala, oh thanks for asking, because I didn't notice the answer. Looking forward to it asap. – nop Mar 21 '22 at 07:38

1 Answer

Let me capture in an answer the essence of our conversation via comments.

ReceiveLoop with retry

Your retry policy's ExecuteAsync exits with success while ReceiveLoop is still waiting for input.ReadAsync to finish. The reason is that you do not await ReceiveLoop; you just kick it off in a fire-and-forget manner.

In other words, your retry logic only applies to StartAsync and to the code that runs before the first await inside ReceiveLoop.
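
To make this concrete, here is a minimal sketch that uses only the base class library. GuardAsync is a hypothetical stand-in for a policy's ExecuteAsync and ReceiveLoopAsync is a stand-in for the real loop; neither name comes from Polly or from the question's code.

```csharp
using System;
using System.Threading.Tasks;

public static class FireAndForgetDemo
{
    // Stand-in for ExecuteAsync: it can only observe failures of the task it awaits.
    public static async Task<bool> GuardAsync(Func<Task> action)
    {
        try
        {
            await action().ConfigureAwait(false);
            return true;  // reported as success
        }
        catch (InvalidOperationException)
        {
            return false; // a real retry policy would retry here
        }
    }

    // Stand-in for ReceiveLoop: it fails only after its first await.
    public static async Task ReceiveLoopAsync()
    {
        await Task.Delay(50).ConfigureAwait(false);
        throw new InvalidOperationException("connection dropped");
    }

    public static async Task Main()
    {
        Task background = Task.CompletedTask;

        // Fire and forget: the guard returns before the loop fails.
        bool fireAndForget = await GuardAsync(() =>
        {
            background = ReceiveLoopAsync(); // started, but not awaited
            return Task.CompletedTask;
        });

        // Awaited: the failure surfaces inside the guard.
        bool awaited = await GuardAsync(() => ReceiveLoopAsync());

        Console.WriteLine($"fire-and-forget reported success: {fireAndForget}"); // True
        Console.WriteLine($"awaited reported success: {awaited}");               // False

        // Observe the background failure so it does not go unnoticed.
        try { await background.ConfigureAwait(false); }
        catch (InvalidOperationException) { }
    }
}
```

The first call reports success although the loop fails 50 ms later, which is what happens with _ = ReceiveLoop() inside ExecuteAsync.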

The fix is to move the retry logic inside ReceiveLoop.
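
One possible shape for this, as a rough sketch rather than a drop-in replacement: a single retry policy wraps connect plus receive, so a failure in either one triggers a reconnect. ReconnectingReceiver and its two delegates are hypothetical names. The sketch assumes the Polly v7 NuGet package, and it assumes the receive delegate throws when the connection is lost. The question's ReceiveLoop swallows exceptions and breaks out of the loop instead, so it would need to rethrow (or throw when the pipe completes unexpectedly) for this to work.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;
using Polly;

public sealed class ReconnectingReceiver
{
    private readonly Func<CancellationToken, Task> _connectAsync; // e.g. wraps _webSocketPipe.StartAsync
    private readonly Func<CancellationToken, Task> _receiveAsync; // e.g. the body of ReceiveLoop
    private readonly TimeSpan _retryDelay;

    public ReconnectingReceiver(
        Func<CancellationToken, Task> connectAsync,
        Func<CancellationToken, Task> receiveAsync,
        TimeSpan retryDelay)
    {
        _connectAsync = connectAsync;
        _receiveAsync = receiveAsync;
        _retryDelay = retryDelay;
    }

    // Completes when the receive delegate returns normally or stopToken is cancelled.
    public Task RunAsync(CancellationToken stopToken)
    {
        var retryPolicy = Policy
            // Cancellation requested by the caller must not be retried.
            .Handle<Exception>(ex => !(ex is OperationCanceledException))
            .WaitAndRetryForeverAsync(
                _ => _retryDelay,
                (ex, delay) => Console.WriteLine(
                    $"{ex.Message}. Retry in {delay.TotalSeconds} seconds."));

        // Both steps are awaited inside the policy, so a dropped
        // connection in the receive step starts a new attempt.
        return retryPolicy.ExecuteAsync(async token =>
        {
            await _connectAsync(token).ConfigureAwait(false);
            await _receiveAsync(token).ConfigureAwait(false);
        }, stopToken);
    }
}
```

To keep the UI responsive, the caller can start RunAsync without awaiting it (or via Task.Run) and cancel stopToken from StopAsync or Dispose.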

Timeout

Polly's Timeout policy can use either an optimistic or a pessimistic strategy. The former relies heavily on the CancellationToken.

  • If you pass, for example, CancellationToken.None to ExecuteAsync, you are basically saying: let the TimeoutPolicy handle the cancellation process.
  • If you pass an existing token, the decorated Task can be cancelled either by the TimeoutPolicy or by the provided token.

In both cases the delegate has to accept the CancellationToken it receives from ExecuteAsync and pass it on to the operation; otherwise the optimistic timeout has nothing to cancel.

Please bear in mind that on timeout it throws TimeoutRejectedException, not OperationCanceledException.
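
A small sketch of the three possible outcomes, assuming the Polly v7 NuGet package. TimeoutDemo and RunAsync are hypothetical names, and Task.Delay merely stands in for a cancellable call such as ConnectAsync.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;
using Polly;
using Polly.Timeout;

public static class TimeoutDemo
{
    public static async Task<string> RunAsync(
        TimeSpan timeout, TimeSpan workDuration, CancellationToken callerToken = default)
    {
        var timeoutPolicy = Policy.TimeoutAsync(timeout, TimeoutStrategy.Optimistic);

        try
        {
            // 'token' fires on timeout or when callerToken is cancelled;
            // the delegate must pass it on for the timeout to take effect.
            await timeoutPolicy
                .ExecuteAsync(token => Task.Delay(workDuration, token), callerToken)
                .ConfigureAwait(false);
            return "completed";
        }
        catch (TimeoutRejectedException)
        {
            return "timed out";           // raised by the timeout policy
        }
        catch (OperationCanceledException)
        {
            return "cancelled by caller"; // raised because callerToken was cancelled
        }
    }
}
```

For example, RunAsync(TimeSpan.FromMilliseconds(50), TimeSpan.FromSeconds(5)) should end in the TimeoutRejectedException branch, while a token cancelled by the caller ends in the OperationCanceledException branch.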

onTimeoutAsync

TimeoutAsync has several overloads that accept one of two onTimeoutAsync delegates:

Func<Context, TimeSpan, Task, Task> onTimeoutAsync

or

Func<Context, TimeSpan, Task, Exception, Task> onTimeoutAsync

This is useful for logging that a timeout has occurred when you have an outer policy (for example, a retry) that triggers on the TimeoutRejectedException.

Chaining policies

I suggest using the Policy.WrapAsync static method instead of the WrapAsync instance method of AsyncPolicy.

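// Example values (assumptions): fail fast after 5 seconds, retry every 10 seconds.
const int timeoutMs = 5_000;
const int retryBackOffMs = 10_000;

// Inner policy: cancels the delegate through its CancellationToken and logs the timeout.
var timeoutPolicy = Policy.TimeoutAsync(TimeSpan.FromMilliseconds(timeoutMs), TimeoutStrategy.Optimistic,
    (context, timeSpan, task, ex) =>
    {
        Console.WriteLine($"Timeout {timeSpan.TotalSeconds} seconds");
        return Task.CompletedTask;
    });

// Outer policy: retries forever on any exception, including TimeoutRejectedException.
var retryPolicy = Policy
    .Handle<Exception>(ex =>
    {
        Console.WriteLine($"Exception: {ex.Message}");
        return true;
    })
    .WaitAndRetryForeverAsync(_ => TimeSpan.FromMilliseconds(retryBackOffMs),
    (ex, retryCount, calculatedWaitDuration) =>
    {
        Console.WriteLine(
            $"Retrying in {calculatedWaitDuration.TotalSeconds} seconds (Reason: {ex.Message}) (Retry count: {retryCount})");
    });

// The first argument is the outermost policy: retry wraps timeout.
var resilientStrategy = Policy.WrapAsync(retryPolicy, timeoutPolicy);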

With this approach your retry policy's definition does not refer to the timeout policy explicitly. Rather you have two separate policies and a chained one.
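
As a hedged sketch of how the chained policy might be used against a web socket endpoint: ResilientConnect is a hypothetical helper, the policy argument is expected to be something like resilientStrategy above, and the Polly v7 NuGet package is assumed. I have not run this against a real server.

```csharp
using System;
using System.Net.WebSockets;
using System.Threading;
using System.Threading.Tasks;
using Polly;

public static class ResilientConnect
{
    // Opens a web socket through the given policy (for example retry wrapping timeout).
    public static Task<ClientWebSocket> ConnectAsync(
        Uri url, IAsyncPolicy policy, CancellationToken stopToken)
    {
        return policy.ExecuteAsync(async token =>
        {
            // A ClientWebSocket cannot be reused after a failed connect,
            // so every attempt gets a fresh instance.
            var socket = new ClientWebSocket();
            try
            {
                // 'token' is cancelled by the timeout policy or by stopToken.
                await socket.ConnectAsync(url, token).ConfigureAwait(false);
                return socket;
            }
            catch
            {
                socket.Dispose();
                throw;
            }
        }, stopToken);
    }
}
```

A call could then look like await ResilientConnect.ConnectAsync(new Uri("wss://example.org/socket"), resilientStrategy, cts.Token), where the URL is a placeholder and cts is a CancellationTokenSource that StopAsync cancels.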

Peter Csala
  • Thank you for your answer! I wrote you a message on StackOverflow chat – nop Mar 21 '22 at 08:26
  • I want to express my special thanks to this guy. If I could give him more reputation, I would, but it's not possible. – nop Mar 22 '22 at 09:33
  • Hey, could you have a look at the StackOverflow chat? – nop May 03 '22 at 07:05