I'm currently testing out C# 8's async streams, and the old pattern of using async/await and returning Task<IEnumerable<T>> seems to be faster. I measured both with a Stopwatch over multiple runs, and the old pattern came out somewhat faster than IAsyncEnumerable.

Here's a simple console app that I wrote (I'm also wondering whether I'm loading the data from the database the wrong way):

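using System;
using System.Collections.Generic;
using System.Data;
using System.Data.SqlClient; // assumption: or Microsoft.Data.SqlClient, whichever package the project references
using System.Diagnostics;
using System.Threading.Tasks;

class Program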
    {
        static async Task Main(string[] args)
        {

            // Using the old pattern 
            //Stopwatch stopwatch = Stopwatch.StartNew();
            //foreach (var person in await LoadDataAsync())
            //{
            //    Console.WriteLine($"Id: {person.Id}, Name: {person.Name}");
            //}
            //stopwatch.Stop();
            //Console.WriteLine(stopwatch.ElapsedMilliseconds);


            Stopwatch stopwatch = Stopwatch.StartNew();
            await foreach (var person in LoadDataAsyncStream())
            {
                Console.WriteLine($"Id: {person.Id}, Name: {person.Name}");
            }
            stopwatch.Stop();
            Console.WriteLine(stopwatch.ElapsedMilliseconds);


            Console.ReadKey();
        }


        static async Task<IEnumerable<Person>> LoadDataAsync()
        {
            string connectionString = "Server=localhost; Database=AsyncStreams; Trusted_Connection = True;";
            var people = new List<Person>();
            using (SqlConnection connection = new SqlConnection(connectionString))
            {
                //SqlDataReader
                await connection.OpenAsync();

                string sql = "Select * From Person";
                SqlCommand command = new SqlCommand(sql, connection);

                using (SqlDataReader dataReader = await command.ExecuteReaderAsync())
                {
                    while (await dataReader.ReadAsync())
                    {
                        Person person = new Person();
                        person.Id = Convert.ToInt32(dataReader[nameof(Person.Id)]);
                        person.Name = Convert.ToString(dataReader[nameof(Person.Name)]);
                        person.Address = Convert.ToString(dataReader[nameof(Person.Address)]);
                        person.Occupation = Convert.ToString(dataReader[nameof(Person.Occupation)]);
                        person.Birthday = Convert.ToDateTime(dataReader[nameof(Person.Birthday)]);
                        person.FavoriteColor = Convert.ToString(dataReader[nameof(Person.FavoriteColor)]);
                        person.Quote = Convert.ToString(dataReader[nameof(Person.Quote)]);
                        person.Message = Convert.ToString(dataReader[nameof(Person.Message)]);

                        people.Add(person);
                    }
                }

                await connection.CloseAsync();
            }

            return people;
        }

        static async IAsyncEnumerable<Person> LoadDataAsyncStream()
        {
            string connectionString = "Server=localhost; Database=AsyncStreams; Trusted_Connection = True;";
            using (SqlConnection connection = new SqlConnection(connectionString))
            {
                //SqlDataReader
                await connection.OpenAsync();

                string sql = "Select * From Person";
                SqlCommand command = new SqlCommand(sql, connection);

                using (SqlDataReader dataReader = await command.ExecuteReaderAsync())
                {
                    while (await dataReader.ReadAsync())
                    {
                        Person person = new Person();
                        person.Id = Convert.ToInt32(dataReader[nameof(Person.Id)]);
                        person.Name = Convert.ToString(dataReader[nameof(Person.Name)]);
                        person.Address = Convert.ToString(dataReader[nameof(Person.Address)]);
                        person.Occupation = Convert.ToString(dataReader[nameof(Person.Occupation)]);
                        person.Birthday = Convert.ToDateTime(dataReader[nameof(Person.Birthday)]);
                        person.FavoriteColor = Convert.ToString(dataReader[nameof(Person.FavoriteColor)]);
                        person.Quote = Convert.ToString(dataReader[nameof(Person.Quote)]);
                        person.Message = Convert.ToString(dataReader[nameof(Person.Message)]);

                        yield return person;
                    }
                }

                await connection.CloseAsync();
            }
        }
    }

I would like to know whether IAsyncEnumerable is simply not well suited to this kind of scenario, or whether there is something wrong with how I query the data when using IAsyncEnumerable. I might be wrong, but I expected IAsyncEnumerable to be faster. (By the way, the differences are usually in the hundreds of milliseconds.)

I tried the application with a sample data of 10,000 rows.

Here's also the code for populating the data just in case...

static async Task InsertDataAsync()
        {
            string connectionString = "Server=localhost; Database=AsyncStreams; Trusted_Connection = True;";
            using (SqlConnection connection = new SqlConnection(connectionString))
            {
                string sql = $"Insert Into Person (Name, Address, Birthday, Occupation, FavoriteColor, Quote, Message) Values";


                for (int i = 0; i < 1000; i++)
                {
                    sql += $"('{"Randel Ramirez " + i}', '{"Address " + i}', '{new DateTime(1989, 4, 26)}', '{"Software Engineer " + i}', '{"Red " + i}', '{"Quote " + i}', '{"Message " + i}'),";
                }

                using (SqlCommand command = new SqlCommand(sql.Remove(sql.Length - 1), connection))
                {
                    command.CommandType = CommandType.Text;

                    await connection.OpenAsync();
                    await command.ExecuteNonQueryAsync();
                    await connection.CloseAsync();
                }

            }
        }
Theodor Zoulias
Randel Ramirez
  • That's not surprising. With `IAsyncEnumerable`, you're `awaiting` each person. With `Task`, you're awaiting just once. The advantage with `IAsyncEnumerable` is that you get to see each person as they're fetched: you don't have to wait for all the people to be fetched. If you don't need that, don't use `IAsyncEnumerable` – canton7 Jan 16 '20 at 16:27
  • @canton7 This is not entirely correct. Within LoadDataAsyncStream the code is awaiting for each call to ExecuteReaderAsync as well. – Fabian Bigler Jan 16 '20 at 16:36
  • @FabianBigler I was talking about consuming the `IAsyncEnumerable` / `Task`. The same number of awaits are needed to create it in both cases – canton7 Jan 16 '20 at 16:37
  • Actually, an `IAsyncEnumerable` implementation is allowed to "produce" batches of values making the `MoveNextAsync` synchronous for values already batched. – Paulo Morgado Jan 16 '20 at 17:49
  • Does the performance difference still hold if you comment out the line `Console.WriteLine($"Id: {person.Id}, Name: {person.Name}");`? My theory is that printing the data while fetching it from the database may slow down the asynchronous communication with the DB. – Theodor Zoulias Jan 17 '20 at 09:09
  • @TheodorZoulias I did try without writing to the console, and the performance still seems to favor using Task – Randel Ramirez Jan 17 '20 at 14:44
  • How much is the difference? Does it make any difference if you run the new pattern first and the old pattern second? – Theodor Zoulias Jan 17 '20 at 15:41
  • @TheodorZoulias The difference is generally about 100ms, but running it multiple times gives varied results. Generally it seems to favor the former. The reason I posted the question was that I assumed IAsyncEnumerable would be faster than Task, and I thought maybe I'm just doing it wrong, which causes the results to favor the 'old pattern' – Randel Ramirez Jan 17 '20 at 18:33
  • Does this difference of 100ms scale when reading more records? Or is it constant? – Theodor Zoulias Jan 17 '20 at 20:02

2 Answers

IAsyncEnumerable<T> is not inherently faster or slower than Task<T>. It depends on the implementation.

IAsyncEnumerable<T> is about asynchronously retrieving data providing individual values as soon as possible.

IAsyncEnumerable<T> allows values to be produced in batches, which makes some of the invocations of MoveNextAsync complete synchronously, as in the next example:

async Task Main()
{
    var hasValue = false;
    var asyncEnumerator = GetValuesAsync().GetAsyncEnumerator();
    do
    {
        var task = asyncEnumerator.MoveNextAsync();
        Console.WriteLine($"Completed synchronously: {task.IsCompleted}");
        hasValue = await task;
        if (hasValue)
        {
            Console.WriteLine($"Value={asyncEnumerator.Current}");
        }
    }
    while (hasValue);
    await asyncEnumerator.DisposeAsync();
}

async IAsyncEnumerable<int> GetValuesAsync()
{
    foreach (var batch in GetValuesBatch())
    {
        await Task.Delay(1000);
        foreach (var value in batch)
        {
            yield return value;
        }
    }
}
IEnumerable<IEnumerable<int>> GetValuesBatch()
{
    yield return Enumerable.Range(0, 3);
    yield return Enumerable.Range(3, 3);
    yield return Enumerable.Range(6, 3);
}

Output:

Completed synchronously: False
Value=0
Completed synchronously: True
Value=1
Completed synchronously: True
Value=2
Completed synchronously: False
Value=3
Completed synchronously: True
Value=4
Completed synchronously: True
Value=5
Completed synchronously: False
Value=6
Completed synchronously: True
Value=7
Completed synchronously: True
Value=8
Completed synchronously: True
Bizhan
Paulo Morgado
  • Attention: the method `IAsyncEnumerator.MoveNextAsync` returns a `ValueTask`, and doing anything with a `ValueTask` other than awaiting it (once) or calling its `AsTask` method is against the type's [contract](https://learn.microsoft.com/en-us/dotnet/api/system.threading.tasks.valuetask-1#remarks). This includes querying its `IsCompleted` property. – Theodor Zoulias Jan 17 '21 at 04:45
  • That's correct, @TheodorZoulias. But, in this particular case, the code is not holding on to the `ValueTask` beyond awaiting it or calling `MoveNextAsync` again. – Paulo Morgado Jan 17 '21 at 19:16
  • What about this line: `Console.WriteLine($"Completed synchronously: {task.IsCompleted}");`? – Theodor Zoulias Jan 17 '21 at 19:26
  • It's always called before the next call to `MoveNextAsync`. – Paulo Morgado Jan 17 '21 at 19:33
  • Now that I am thinking of it, the `IsCompleted` property should be safe to call. Otherwise it would not have a reason for its existence. But I am not 100% sure. – Theodor Zoulias Jan 17 '21 at 19:38
  • You just can't store a `ValueTask` and use it after the next call to `MoveNextAsync` – Paulo Morgado Jan 18 '21 at 12:18
  • That's not the only restriction. Querying its `Result` property before its completion is not allowed, awaiting it twice is not allowed, calling `AsTask` twice is not allowed etc. – Theodor Zoulias Jan 18 '21 at 16:37

I think the answer to the question of "I would like to know whether IAsyncEnumerable is not best suited for this kind of scenario" got a bit lost in @Bizhan's example of batching and the ensuing discussion, but to reiterate from that post:

IAsyncEnumerable<T> is about asynchronously retrieving data providing individual values as soon as possible.

The OP is measuring the total time to read all records and ignoring how quickly the first record is retrieved and ready to be used by the calling code.

If "this kind of scenario" means reading all the data into memory as fast as possible, then IAsyncEnumerable is not best suited for that.

If it is important to start processing the initial records before waiting for all of the records to be read, that is what IAsyncEnumerable is best suited for.

However, in the real world, you should really be testing the performance of the total system, which would include actual processing of the data as opposed to simply outputting it to a console. Particularly in a multithreaded system, maximum performance could be gained by starting to process multiple records simultaneously as quickly as possible, while at the same time reading in more data from the database. Compare that to waiting for a single thread to read all of the data up front (assuming you could fit the entire dataset into memory) and only then being able to start processing it.
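
To make the difference between total time and time to the first record concrete, here is a minimal sketch that measures both for each pattern. It is not the OP's code: the database is replaced by `Task.Delay`, and the names `LoadAllAsync`, `StreamAsync`, `Count`, and `DelayMs` are hypothetical, chosen only for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Diagnostics;
using System.Threading.Tasks;

public static class Program
{
    public const int Count = 5;     // hypothetical number of rows
    public const int DelayMs = 100; // hypothetical latency per row

    // Buffered: the caller sees nothing until every row has been read.
    public static async Task<List<int>> LoadAllAsync()
    {
        var items = new List<int>();
        for (int i = 0; i < Count; i++)
        {
            await Task.Delay(DelayMs); // stands in for dataReader.ReadAsync()
            items.Add(i);
        }
        return items;
    }

    // Streamed: each row is handed to the caller as soon as it is read.
    public static async IAsyncEnumerable<int> StreamAsync()
    {
        for (int i = 0; i < Count; i++)
        {
            await Task.Delay(DelayMs); // stands in for dataReader.ReadAsync()
            yield return i;
        }
    }

    public static async Task Main()
    {
        var stopwatch = Stopwatch.StartNew();
        long firstBuffered = -1;
        foreach (var item in await LoadAllAsync())
        {
            // Record when the first item becomes visible to the caller.
            if (firstBuffered < 0) firstBuffered = stopwatch.ElapsedMilliseconds;
        }
        long totalBuffered = stopwatch.ElapsedMilliseconds;

        stopwatch.Restart();
        long firstStreamed = -1;
        await foreach (var item in StreamAsync())
        {
            if (firstStreamed < 0) firstStreamed = stopwatch.ElapsedMilliseconds;
        }
        long totalStreamed = stopwatch.ElapsedMilliseconds;

        Console.WriteLine($"Buffered: first item after {firstBuffered} ms, total {totalBuffered} ms");
        Console.WriteLine($"Streamed: first item after {firstStreamed} ms, total {totalStreamed} ms");
    }
}
```

Under these assumptions the two totals should come out close to each other, while the first streamed item should arrive after roughly one delay instead of after all of them. Exact numbers depend on timer resolution and on the machine.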

Harlow Burgess
  • For another example, in ASP.NET Core 6, a controller can return an `IAsyncEnumerable` and the framework will stream the resulting JSON while fetching the results. This can improve both the time to the first object and, if the transfer is I/O bound, the total time. – Jeremy Lakeman Nov 25 '21 at 02:38