1

I'm attempting to query the public NPPES NPI registry using Flurl as my HTTP client library, version 2.4.2.

After 4-5 successful async requests (debugging by stepping through each request in the loop) it always fails with a SocketException that the connection was forcibly closed. I'm assuming the API is rate limited, but slowing down the requests doesn't seem to be working. Here is my relevant code.

static async Task<Result> GetNpiEntry(string npiNumber)
{
    var npiEntry = await "https://npiregistry.cms.hhs.gov"
        .AppendPathSegment("api/")
        .SetQueryParams(new { version = "2.1", number = npiNumber }).GetJsonAsync<RootObject>();

    return npiEntry.results[0];
}

and the loop calling it with a hefty sleep between requests.

List<Result> npiResults = new List<Result>(npiTable.Rows.Count);

foreach (DataRow row in npiTable.Rows)
{
    Result npiEntry = Task.Run(() => GetNpiEntry((string)row[0])).Result;
    npiResults.Add(npiEntry);
    Thread.Sleep(2000);
}

Here's the actual exception.

 ---> System.Net.Http.HttpRequestException: An error occurred while sending the request.
 ---> System.IO.IOException: Unable to read data from the transport connection: An existing connection was forcibly closed by the remote host..
 ---> System.Net.Sockets.SocketException (10054): An existing connection was forcibly closed by the remote host.  

Is there some more appropriate way for me to debug or do this? I imagine I'd need to rate limit the client, but shouldn't a generous wait time between requests work at least for debugging?

Nate Barbettini
  • 51,256
  • 26
  • 134
  • 147
tNAPA
  • 13
  • 3
  • Welcome to Stack Overflow! Thanks for posting your code and detailed error info with your question, it makes it easier to answer. :) – Nate Barbettini Jan 02 '20 at 18:17
  • Can you make your `Task.Run` lambda `async` and then `await` the call to `GetNpiEntry`. Ie: `Task.Run( async ( ) => await GetNpiEntry` – WBuck Jan 02 '20 at 19:00
  • @WBuck I can't add that as it fails to compile, the foreach loop is nestled within the ```static void Main(string[] args) {}``` entry point. Adding another sleep before the Task.Run() call allows it to complete ~1000 requests before hitting this error, but it hits it none the less. – tNAPA Jan 02 '20 at 19:06
  • I'm posting an answer give me a second – WBuck Jan 02 '20 at 19:07
  • WBuck's answer is correct. I added some detail and a repro in mine. Hope that's helpful @tNAPA! – Nate Barbettini Jan 02 '20 at 19:22

2 Answers2

1

You're not awaiting the call to GetNpiEntry in the Task.Run lambda. If the call to GetNpiEntry executed synchronously then you have no problems. If the call to GetNpiEntry executes asynchronously then the Task.Run lambda will not await its result.

Try the following instead:

foreach (DataRow row in npiTable.Rows)
{
    Result npiEntry = Task.Run( async () => await GetNpiEntry((string)row[0])).GetAwaiter( ).GetResult( );
    npiResults.Add(npiEntry);
    Thread.Sleep(2000);
}

I also see you're using .NetCore so you should be able to use the following main instead.

static async Task Main( string[ ] args )

This allows you to use await within Main .

WBuck
  • 5,162
  • 2
  • 25
  • 36
  • Thanks that does work for processing the results, however after (in this case 101) requests the socket is closed by the remote host and the exception is thrown. How should I deal with reopening a new socket when this occurs, and more directly why does this fail while I can retrieve approximately 2000 NPIs (sequentially) without issue in ~ 2 minutes through a less programmatic client, but waiting several seconds between requests is bound to fail after a few minutes in this C# version. – tNAPA Jan 02 '20 at 19:25
  • 1
    Network exceptions happen. Its just one of those things that can and do often fail. It's up to you to `catch` the `Exception`. Once caught you need to decide how you want to recover from it. Often I'll have "retry" logic which will attempt the same call `N` times on a failure. Usually I first try to re-connect with the `remote` and upon successfully connecting I will re-issue my request. – WBuck Jan 02 '20 at 19:36
  • A simple try {} catch { wait and retry } seems like it might be enough. Thank you. – tNAPA Jan 02 '20 at 19:52
0

The problem is not Flurl or API throttling, but blocking on an async call (as @WBuck said).

The best practice is to avoid .Result and even .GetAwaiter().GetResult(). Those and Task.Run() are all examples of mixing sync and async code. There are good articles online for why this is bad - search for .net mix sync async for more background info.

The correct solution is to use async "all the way down". Pretty much every entry point can be marked async now, including console apps.

This works on my machine even without the sleeps:

using System;
using System.Collections.Generic;
using System.Threading.Tasks;
using Flurl;
using Flurl.Http;

namespace SO_59567958
{
    class Program
    {
        static async Task Main(string[] args)
        {
            var ids = new[] { "1023086709", "1659325215", "1912946427", "1740219450", "1750497640", "1538260823", "1275625626", "1144303488", "1205919107", "1730281890", "1568453561" };
            Console.WriteLine($"Retrieving {ids.Length} NPI records...");

            var npiResults = new List<Result>();
            foreach (var id in ids)
            {
                var retrievedFromApi = await GetNpiEntry(id);
                npiResults.Add(retrievedFromApi);
            }

            Console.WriteLine("Done");
            npiResults.ForEach(x => Console.WriteLine(x.Number));
        }

        static async Task<Result> GetNpiEntry(string npiNumber)
        {
            var npiEntry = await "https://npiregistry.cms.hhs.gov"
                .AppendPathSegment("api/")
                .SetQueryParams(new { version = "2.1", number = npiNumber })
                .GetJsonAsync<RootObject>();

            return npiEntry.Results[0];
        }
    }

    public class RootObject
    {
        public int ResultCount { get; set; }

        public IReadOnlyList<Result> Results { get; set; }
    }

    public class Result
    {
        public int Number { get; set; }
    }
}
Nate Barbettini
  • 51,256
  • 26
  • 134
  • 147
  • Thank you for explaining how to do the await async calls correctly. However this example still fails after enough iterations with the same socket error. I have a less useful implementation of this (VB.NET generated by another program) that could complete querying the entire data set (~2000 entries for this) in about 2 minutes without error or any sleeps, but the connection is forcibly closed using this after a few dozen queries. – tNAPA Jan 02 '20 at 19:32
  • I think I'm able to handle it well enough by just catching the exception and waiting before trying again, thank you. – tNAPA Jan 02 '20 at 19:53
  • You're welcome! For more complex retry policies, check out: http://www.thepollyproject.org/ – Nate Barbettini Jan 02 '20 at 20:23