I want to use an IAsyncEnumerable as a Source for Akka.NET Streams, but I could not find how to do it.

There is no suitable method in the Source class for code like this:

using System.Collections.Generic;
using System.Threading.Tasks;
using Akka.Streams.Dsl;

namespace ConsoleApp1
{
    class Program
    {
        static async Task Main(string[] args)
        { 
            Source.From(await AsyncEnumerable())
                .Via(/*some action*/)
                //.....
        }

        private static async IAsyncEnumerable<int> AsyncEnumerable()
        {
            //some async enumerable
        }
    }
}

How can I use an IAsyncEnumerable as a Source?

Levi Ramsey
Dmitry

3 Answers

3

This used to be available as part of the Akka.NET Streams contrib package, but since I don't see it there anymore, let's go through how to implement such a source. The topic can be quite long, as:

  1. Akka.NET Streams is really about graph processing: we're talking about many-inputs/many-outputs configurations (in Akka.NET they're called inlets and outlets) with support for cycles in graphs.
  2. Akka.NET is not built on top of .NET async/await or even on top of the .NET standard thread pool library. Both are pluggable, which means that the lowest common denominator is basically using callbacks and encoding by hand what the C# compiler sometimes does for us.
  3. Akka.NET Streams is capable of both pushing and pulling values between stages/operators. IAsyncEnumerable<T> can only pull data while IObservable<T> can only push it, so we get more expressive power here, but this comes at a cost.

The basics of low level API used to implement custom stages can be found in the docs.

The starter boilerplate looks like this:

public static class AsyncEnumerableExtensions {
    // Helper method to change IAsyncEnumerable into Akka.NET Source.
    public static Source<T, NotUsed> AsSource<T>(this IAsyncEnumerable<T> source) => 
        Source.FromGraph(new AsyncEnumerableSource<T>(source));
}

// Source stage is description of a part of the graph that doesn't consume
// any data, only produce it using a single output channel.
public sealed class AsyncEnumerableSource<T> : GraphStage<SourceShape<T>>
{
    private readonly IAsyncEnumerable<T> _enumerable;

    public AsyncEnumerableSource(IAsyncEnumerable<T> enumerable)
    {
        _enumerable = enumerable;
        Outlet = new Outlet<T>("asyncenumerable.out");
        Shape = new SourceShape<T>(Outlet);
    }

    public Outlet<T> Outlet { get; }
    public override SourceShape<T> Shape { get; }

    /// Logic is to a graph stage what an enumerator is to an enumerable.
    protected override GraphStageLogic CreateLogic(Attributes inheritedAttributes) => 
        new Logic(this);

    sealed class Logic: OutGraphStageLogic
    {
        public override void OnPull()
        {
            // method called whenever a consumer asks for new data
        }

        public override void OnDownstreamFinish() 
        {
            // method called whenever a consumer stage finishes,used for disposals
        }
    }
}

As mentioned, we don't use async/await straight away here. Moreover, calling Logic methods from an asynchronous context is unsafe. To make it safe, we need to register our methods that may be called from other threads using GetAsyncCallback<T> and call them via the returned wrappers. This ensures that no data races happen when executing asynchronous code.

sealed class Logic : OutGraphStageLogic
{
    private readonly Outlet<T> _outlet;
    // enumerator we'll call for MoveNextAsync, and eventually dispose
    private readonly IAsyncEnumerator<T> _enumerator;
    // callback called whenever _enumerator.MoveNextAsync completes asynchronously
    private readonly Action<Task<bool>> _onMoveNext;
    // callback called whenever _enumerator.DisposeAsync completes asynchronously
    private readonly Action<Task> _onDisposed;
    // cache used for errors thrown by _enumerator.MoveNextAsync, that
    // should be rethrown after _enumerator.DisposeAsync
    private Exception? _failReason = null;
    
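    public Logic(AsyncEnumerableSource<T> source) : base(source.Shape)
    {
        _outlet = source.Outlet;
        _enumerator = source._enumerable.GetAsyncEnumerator();
        _onMoveNext = GetAsyncCallback<Task<bool>>(OnMoveNext);
        _onDisposed = GetAsyncCallback<Task>(OnDisposed);
        // register this logic as the outlet's handler; without it
        // OnPull and OnDownstreamFinish are never invoked
        SetHandler(_outlet, this);
    }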

    // ... other methods
}

The last part is the methods overridden on Logic:

  • OnPull used whenever the downstream stage calls for new data. Here we need to call for next element of async enumerator sequence.
  • OnDownstreamFinish called whenever the downstream stage has finished and will not ask for any new data. It's the place for us to dispose our enumerator.

The catch is that these methods are not async/await, while their enumerator equivalents are. What we basically need to do is:

  1. Call the corresponding async method of the underlying enumerator (OnPull → MoveNextAsync and OnDownstreamFinish → DisposeAsync).
  2. Check whether we can take the result immediately. This is an important step that the C# compiler usually does for us in async/await calls.
  3. If not, and we need to wait for the result, call ContinueWith to register our callback wrapper, to be called once the async method is done.

sealed class Logic : OutGraphStageLogic
{
    // ... constructor and fields

    public override void OnPull()
    {
        var hasNext = _enumerator.MoveNextAsync();
        if (hasNext.IsCompletedSuccessfully)
        {
            // first try the short path: the value is returned immediately
            if (hasNext.Result)
                // check if there was next value and push it downstream
                Push(_outlet, _enumerator.Current);
            else
                // if there was none, we reached end of async enumerable
                // and we can dispose it
                DisposeAndComplete();
        }
        else
            // we need to wait for the result
            hasNext.AsTask().ContinueWith(_onMoveNext);
    }

    // This method is called when another stage downstream has been completed
    public override void OnDownstreamFinish() =>
        // dispose enumerator on downstream finish
        DisposeAndComplete();
    
    private void DisposeAndComplete()
    {
        var disposed = _enumerator.DisposeAsync();
        if (disposed.IsCompletedSuccessfully)
        {
            // enumerator disposal completed immediately
            if (_failReason is not null)
                // if we close this stream in result of error in MoveNextAsync,
                // fail the stage
                FailStage(_failReason);
            else
                // we can close the stage with no issues
                CompleteStage();
        }
        else 
            // we need to await for enumerator to be disposed
            disposed.AsTask().ContinueWith(_onDisposed);
    }

    private void OnMoveNext(Task<bool> task)
    {
        // since this is a callback, the task is always completed; we just
        // need to check for exceptions
        if (task.IsCompletedSuccessfully)
        {
            if (task.Result)
                // if task returns true, it means we read a value
                Push(_outlet, _enumerator.Current);
            else
                // otherwise there are no more values to read and we can close the source
                DisposeAndComplete();
        }
        else
        {
            // task either failed or has been cancelled: remember the reason,
            // dispose the enumerator and fail the stage once disposal is done
            _failReason = task.Exception as Exception ?? new TaskCanceledException(task);
            DisposeAndComplete();
        }
    }

    private void OnDisposed(Task task)
    {
        // complete only if disposal succeeded and MoveNextAsync did not fail earlier
        if (task.IsCompletedSuccessfully && _failReason is null) CompleteStage();
        else {
            var reason = task.Exception as Exception 
                ?? _failReason 
                ?? new TaskCanceledException(task);
            FailStage(reason);
        }
    }
}
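
A minimal usage sketch follows, assuming the AsyncEnumerableExtensions.AsSource helper and the AsyncEnumerableSource<T> stage above are compiled into the same project. The Numbers() sequence and the actor system name are hypothetical and only stand in for the question's AsyncEnumerable() method.

```csharp
using System;
using System.Collections.Generic;
using System.Threading.Tasks;
using Akka.Actor;
using Akka.Streams;
using Akka.Streams.Dsl;

// Assumption: AsSource() is the extension method defined earlier in this answer.
using var system = ActorSystem.Create("async-enumerable-demo");
var materializer = system.Materializer();

// Sink.Seq collects all elements into an immutable list once the stream completes.
var doubled = await Numbers()
    .AsSource()
    .Select(x => x * 2)
    .RunWith(Sink.Seq<int>(), materializer);

Console.WriteLine(string.Join(", ", doubled));

// Hypothetical async sequence used only for illustration.
static async IAsyncEnumerable<int> Numbers()
{
    for (var i = 1; i <= 3; i++)
    {
        await Task.Delay(10); // simulate asynchronous work
        yield return i;
    }
}
```

Note that the stage obtains its enumerator inside the Logic constructor, so each materialization of the source calls GetAsyncEnumerator() again on the wrapped IAsyncEnumerable.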

Bartosz Sypytkowski
  • This worked for me. I have an actor that is a gRPC client to a bi-directional stream. Needed to get the IAsyncEnumerable into a source and then iterate the stream with "source.RunForeach". Using the extension method above like this: "var source = _duplexStream.ResponseStream.ReadAllAsync().AsSource();". Did need to add this line of code to the constructor of the "Logic" class. "SetHandler(source.Outlet, OnPull);". Thanks @Bartosz – RussellEast May 10 '22 at 14:27
3

As of Akka.NET v1.4.30 this is now natively supported inside Akka.Streams via the RunAsAsyncEnumerable method, which exposes a Source as an IAsyncEnumerable:

var input = Enumerable.Range(1, 6).ToList();

var cts = new CancellationTokenSource();
var token = cts.Token;

var asyncEnumerable = Source.From(input).RunAsAsyncEnumerable(Materializer);
var output = input.ToArray();
bool caught = false;
try
{
    await foreach (var a in asyncEnumerable.WithCancellation(token))
    {
        cts.Cancel();
    }
}
catch (OperationCanceledException e)
{
    caught = true;
}

caught.ShouldBeTrue();

I copied that sample from the Akka.NET test suite, in case you're wondering.

Aaronontheweb
2

You can also use an existing primitive for streaming large collections of data. Here is an example of using Source.UnfoldAsync to stream pages of data (in this case GitHub repositories, fetched with Octokit) until there are no more.

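var source = Source.UnfoldAsync<int, RepositoryPage>(startPage, pageNumber =>
{
    // the lambda parameter is the state: the number of the page to fetch next
    var pageTask = client.GetRepositoriesAsync(pageNumber, pageSize);    
    var next = pageTask.ContinueWith(task =>
    {
        // the page that was just fetched
        var page = task.Result;
        // None ends the stream; otherwise emit the page and continue with the next page number
        if (page.PageNumber * pageSize > page.Total) return Option<(int, RepositoryPage)>.None;
        else return new Option<(int, RepositoryPage)>((page.PageNumber + 1, page));
    });

    return next;
}); 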

To run it:

using var sys = ActorSystem.Create("system");
using var mat = sys.Materializer();

int startPage = 1;
int pageSize = 50;

var client = new GitHubClient(new ProductHeaderValue("github-search-app"));

var source = ...

var sink = Sink.ForEach<RepositoryPage>(Console.WriteLine);

var result = source.RunWith(sink, mat);
await result.ContinueWith(_ => sys.Terminate());

class Page<T>
{
    public Page(IReadOnlyList<T> contents, int page, long total)
    {        
        Contents = contents;
        PageNumber = page;
        Total = total;
    }
    
    public IReadOnlyList<T> Contents { get; set; } = new List<T>();
    public int PageNumber { get; set; }
    public long Total { get; set; }
}

class RepositoryPage : Page<Repository>
{
    public RepositoryPage(IReadOnlyList<Repository> contents, int page, long total) 
        : base(contents, page, total)
    {
    }

    public override string ToString() => 
        $"Page {PageNumber}\n{string.Join("", Contents.Select(x => x.Name + "\n"))}";
}

static class GitHubClientExtensions
{
    public static async Task<RepositoryPage> GetRepositoriesAsync(this GitHubClient client, int page, int size)
    {
        // specify a search term here
        var request = new SearchRepositoriesRequest("bootstrap")
        {
            Page = page,
            PerPage = size
        };

        var result = await client.Search.SearchRepo(request);
        return new RepositoryPage(result.Items, page, result.TotalCount);        
    }
}
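
The same primitive could in principle be applied to the question's IAsyncEnumerable directly, by using the enumerator itself as the unfold state. This is only a sketch under stated assumptions: ToSourceViaUnfold is a hypothetical helper name (not part of Akka.NET), Option<T> is assumed to come from the Akka.Util namespace, and UnfoldAsync is assumed to take value tuples as in the snippet above.

```csharp
using System.Collections.Generic;
using Akka;
using Akka.Streams.Dsl;
using Akka.Util;

public static class AsyncEnumerableUnfoldExtensions
{
    // Hypothetical helper: wraps an IAsyncEnumerable<T> using Source.UnfoldAsync.
    // Caveats: the enumerator is created once, when the source is built, so the
    // resulting source should be materialized only once; the enumerator is
    // disposed only when the sequence ends normally, not when the downstream
    // cancels or MoveNextAsync throws.
    public static Source<T, NotUsed> ToSourceViaUnfold<T>(this IAsyncEnumerable<T> items) =>
        Source.UnfoldAsync<IAsyncEnumerator<T>, T>(
            items.GetAsyncEnumerator(),
            async enumerator =>
            {
                // each pull from downstream advances the enumerator by one element
                if (await enumerator.MoveNextAsync())
                    return new Option<(IAsyncEnumerator<T>, T)>((enumerator, enumerator.Current));

                // end of sequence: release the enumerator and signal completion
                await enumerator.DisposeAsync();
                return Option<(IAsyncEnumerator<T>, T)>.None;
            });
}
```

This is shorter than a custom graph stage, but because of the caveats in the comments a dedicated stage is likely the safer choice when disposal on cancellation matters.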
tstojecki