2

In functional programming, a statement like if (a || b) can be evaluated in parallel by default, since referential transparency means we don't have to make any assumptions about the evaluation order. Is there an easy way to introduce this type of functional parallelism for conditionals in C#?

For instance, in C# we can easily introduce parallelism over for loops using Parallel.For(), so I'm wondering if there's something similar for conditionals.

I'm trying to bring some of the benefits from functional programming to the way I write C#.

Mark Seemann
user4779
  • 2
    _"For instance in C# we can introduce parallelism over `for` loops easily using `Parallel.For()`"_ - that's oversimplifying things to the point of being misleading, considering that _there's a lot more going on_ when you use `Parallel.For` than just saying it's "a parallel `for`-loop". – Dai Mar 14 '23 at 03:53
  • 1
    _"so I'm wondering if there's something similar to conditionals?"_ - not at the language-level because that would be silly: parallelism/concurrency is not something you can arbitrarily apply to any language-construct - and because C# is _not_ an FP-first language (it doesn't even support compiler-aware `pure` functions) it would be a folly to try to implement it. C# is not Haskell, C# is... C# (or at least, _a better Java_), so don't try to force it into being a language it isn't. – Dai Mar 14 '23 at 03:56
  • 1
    @Dai Just curious how it doesn't relate to referential transparency? If I have bool a = b || c, the evaluation order there shouldn't matter according to referential transparency right? – user4779 Mar 14 '23 at 03:57
  • I misread your opening paragraph and I retracted my comments re: those. – Dai Mar 14 '23 at 04:08
  • 1
    @Dai Not a problem, if it's not possible then that's all I need to know. But I still can't understand why the compiler couldn't optimize that with some annotation or extension. Nothing is to stop me from spinning up two tasks or threads in parallel to evaluate each side separately and doing a .WaitAll to replicate that behaviour right? Assuming of course both operations contain no side effects – user4779 Mar 14 '23 at 04:13
  • 1
    FYI https://learn.microsoft.com/en-us/dotnet/csharp/language-reference/operators/boolean-logical-operators#logical-or-operator- – Jeremy Lakeman Mar 14 '23 at 04:17
  • 3
    @user4779 Well, _that's the thing_: **concurrency is expensive**. Using a `Parallel.For` will always be substantially slower than a `for` loop unless `n` is large enough that the overheads of thread synchronization, locking, cache invalidation, etc. are cancelled out by the performance gains of two or more threads. – Dai Mar 14 '23 at 04:20
  • 1
    @user4779 Also, **never** use `.WaitAll` - that's how you get a deadlock (but had you said `await Task.WhenAll`, that would be okay - but take a look at the actual IL generated whenever you use `await`: it's _very_ gnarly). – Dai Mar 14 '23 at 04:21

2 Answers

4

While possible in theory, I haven't heard about any language that supports this with syntax or simple APIs (but I also only know a handful of languages). As you may know, Haskell is often considered the gold standard of FP, but Boolean expressions in Haskell are just normal, short-circuiting expressions, just like they are in C#.

ghci> a = True
ghci> b = undefined
ghci> (a || b)
True

Here, b is undefined, which means that actually trying to evaluate it will throw an exception:

ghci> b
*** Exception: Prelude.undefined
CallStack (from HasCallStack):
  error, called at libraries\base\GHC\Err.hs:74:14 in base:GHC.Err
  undefined, called at <interactive>:2:5 in interactive:Ghci2

But as you can see in the top example, (a || b) evaluates to True because it short-circuits.

In fact, Haskell is, in general, automatically short-circuiting because of lazy evaluation, but most languages have short-circuiting Boolean expressions.

That's already a good optimization. If you have one cheap and one expensive Boolean expression, put the cheap one first.
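As a minimal sketch of that ordering advice in C#: the names ShortCircuitDemo, Cheap, and Expensive below are made up for illustration, and the call counter exists only to make the short-circuit observable.

```csharp
using System;

public static class ShortCircuitDemo
{
    // Counts how often the expensive predicate actually runs (illustration only).
    public static int ExpensiveCalls;

    // Hypothetical cheap check.
    public static bool Cheap(int n) => n % 2 == 0;

    // Hypothetical stand-in for a costly computation.
    public static bool Expensive(int n)
    {
        ExpensiveCalls++;
        long sum = 0;
        for (int i = 0; i < 1_000_000; i++) sum += (long)i * n;
        return sum > 0;
    }

    // Cheap operand first: Expensive only runs when Cheap returns false.
    public static bool Check(int n) => Cheap(n) || Expensive(n);
}
```

With this ordering, Check(4) never touches Expensive, while Check(3) has to fall through to it.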

So how often do you need more optimization than that? Particularly in the context of referentially transparent functions. Usually, expensive operations are the ones involving I/O.

Not that having two (or more) expensive but referentially transparent Boolean expressions is entirely inconceivable. If you're working in the realm of cryptography, protein folding, or the like, I suppose you could have two Boolean expressions, a and b, that are both referentially transparent and expensive to compute.

While not inconceivable, it seems like a sufficiently niche problem that I wouldn't expect a mainstream language to have it as a built-in feature.

As Dai writes in a comment, concurrency comes with an overhead, so the expressions need to be more expensive than the overhead before such an optimization is warranted.

Can you write an API that gets you there? Yes, absolutely, and I'm sure other people here will tell you how to use Task.WhenAll and Task.WhenAny for that.

If you want hints on how to design such an API, it may help to consider that Boolean values form monoids (under both 'and' and 'or'), and that all monoids can be made lazy:

Lazy<bool> x = // ...
Lazy<bool> y = // ...
Lazy<bool> b = x.And(y);
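The And method used above isn't part of the base class library. One possible sketch, assuming a hypothetical LazyBoolExtensions class with And and Or extension methods, might look like this:

```csharp
using System;

public static class LazyBoolExtensions
{
    // Lazy conjunction: neither operand is forced until the result's Value
    // is read, and y is never forced when x turns out to be false.
    public static Lazy<bool> And(this Lazy<bool> x, Lazy<bool> y) =>
        new Lazy<bool>(() => x.Value && y.Value);

    // Lazy disjunction: y is never forced when x turns out to be true.
    public static Lazy<bool> Or(this Lazy<bool> x, Lazy<bool> y) =>
        new Lazy<bool>(() => x.Value || y.Value);
}
```

This only defers and short-circuits the work; it doesn't run anything in parallel.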

You could do something similar with the TPL, since you can lift any monoid into an applicative functor pointwise:

(Applicative f, Monoid a) => Monoid (Ap f a)

Since C# Tasks are monads, they are also applicative functors, so we know from Haskell that this is possible, and how the API ought to look.
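A rough sketch of what that pointwise lifting could look like for Task<bool> follows. The class name TaskBoolExtensions and the methods Or and And are assumptions for illustration; only Task.WhenAll is an actual TPL call.

```csharp
using System.Threading.Tasks;

public static class TaskBoolExtensions
{
    // Pointwise 'or': both tasks may already be running concurrently;
    // the combined task completes once both have produced a result.
    public static async Task<bool> Or(this Task<bool> x, Task<bool> y)
    {
        bool[] results = await Task.WhenAll(x, y);
        return results[0] || results[1];
    }

    // Pointwise 'and', lifted in the same way.
    public static async Task<bool> And(this Task<bool> x, Task<bool> y)
    {
        bool[] results = await Task.WhenAll(x, y);
        return results[0] && results[1];
    }
}
```

Usage would then mirror the Lazy<bool> example: Task<bool> b = x.Or(y);. Note that this version always waits for both operands, so it gains concurrency but gives up short-circuiting; returning early would need something based on Task.WhenAny instead.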

Mark Seemann
  • Interesting info thanks! Actually it was whilst using separate threads finding a nonce to solve a SHA256 crypto puzzle that the thought entered my mind. Interesting to hear that Haskell short circuits, didn't know that. I thought that sort of default parallelism from referential transparency was one of the major benefits of FP, but I can see given the lack of use cases in a conditional as you pointed out why it might not be there. – user4779 Mar 14 '23 at 09:17
  • @user4779 Again, as *Dai* wrote in a comment, concurrency comes with overhead. It's not free, so it's not something that's just 'always on', not even in Haskell. Typically, if you want to do multiple things in a truly fire-and-forget way, each task may only incur the cost of creating a new thread, but if return values (even Booleans) are involved, even immutable code has to somehow synchronise and marshal return values across threads, and that stuff tends to be more expensive than you'd think. Perhaps not human-perceptibly expensive, but more than just doing things on the same thread. – Mark Seemann Mar 14 '23 at 11:04
  • @MarkSeemann I thought of a possible scenario where it might work the way the OP wants: supposing you're writing a function on a SIMD processor and both left and right operands are the same function/expression but operating on different arguments, couldn't the compiler use SIMD to evaluate both operands simultaneously, with only a teeensy bit of housekeeping at the end? (Though it is a contrived scenario, I admit) – Dai Mar 14 '23 at 11:12
  • @Dai I don't know what a SIMD processor is, but my general point here is *not* that this is never useful, but rather that it probably is less useful than the OP thought. – Mark Seemann Mar 14 '23 at 11:36
2

I created the AnyConditionIsMetAsync method for your use case.

It accepts any number of condition tasks, async methods, or sync methods, and waits until either one of them finishes with a true result or all of them have finished (if none returns true).

Here's an example program showing the usage, followed by the method's code along with some overloads for flexibility:

using System;
using System.Collections.Generic;
using System.Diagnostics;
using System.Linq;
using System.Threading.Tasks;

namespace ParallelConditioning
{
    public class Program
    {
        public static async Task Main(string[] args)
        {
            Stopwatch stopwatch = Stopwatch.StartNew();
            if (await AnyConditionIsMetAsync(Condition1Async(), Condition2Async(), Condition3Async(), Condition4Async()))
            {
                // #1:
                // We start the execution of all condition async methods in parallel and pass their tasks to the AnyConditionIsMetAsync method
                // which waits until any task finishes with a true result
                Console.WriteLine("1 || 2 || 3 || 4: " + stopwatch.Elapsed);
            }

            stopwatch.Restart();
            if (await AnyConditionIsMetAsync(Condition4Async, Condition3Async, Condition2Async, Condition1Async))
            {
                // #2:
                // equivalent to #1,
                // except we're passing the async methods so that the condition tasks are started (via Task.Run) by the AnyConditionIsMetAsync method
                Console.WriteLine("4 || 3 || 2 || 1: " + stopwatch.Elapsed);
            }

            stopwatch.Restart();
            if (await AnyConditionIsMetAsync(Condition1, Condition3, Condition4, Condition2))
            {
                // #3: passing sync functions that will be run in parallel
                Console.WriteLine("1 || 3 || 4 || 2: " + stopwatch.Elapsed);
            }

            stopwatch.Stop();
        }

        private static async Task<bool> Condition1Async()
        {
            await Task.Delay(1000);
            return false;
        }

        private static async Task<bool> Condition2Async()
        {
            await Task.Delay(2000);
            return true;
        }

        private static async Task<bool> Condition3Async()
        {
            await Task.Delay(3000);
            return false;
        }

        private static async Task<bool> Condition4Async()
        {
            await Task.Delay(4000);
            return true;
        }

        private static bool Condition1() => Condition1Async().Result;
        private static bool Condition2() => Condition2Async().Result;
        private static bool Condition3() => Condition3Async().Result;
        private static bool Condition4() => Condition4Async().Result;


        #region AnyConditionIsMetAsync
        private static async Task<bool> AnyConditionIsMetAsync(IEnumerable<Task<bool>> conditionTasks)
        {
            HashSet<Task<bool>> conditionTaskSet = new(conditionTasks);
            while (conditionTaskSet.Count > 0)
            {
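                Task<bool> completedConditionTask = await Task.WhenAny(conditionTaskSet);
                // Awaiting the already-completed task (instead of reading .Result) rethrows
                // the original exception rather than an AggregateException if it faulted.
                if (await completedConditionTask)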
                {
                    return true;
                }
                conditionTaskSet.Remove(completedConditionTask);
            }
            return false;
        }
        private static Task<bool> AnyConditionIsMetAsync(Task<bool> conditionTask, params Task<bool>[] additionalConditionTasks)
            => AnyConditionIsMetAsync(additionalConditionTasks.Prepend(conditionTask));

        private static Task<bool> AnyConditionIsMetAsync(IEnumerable<Func<Task<bool>>> conditionAsyncFunctions)
            => AnyConditionIsMetAsync(conditionAsyncFunctions.Select(Task.Run));
        private static Task<bool> AnyConditionIsMetAsync(Func<Task<bool>> conditionAsyncFunction, params Func<Task<bool>>[] additionalConditionAsyncFunctions)
            => AnyConditionIsMetAsync(additionalConditionAsyncFunctions.Prepend(conditionAsyncFunction));

        private static Task<bool> AnyConditionIsMetAsync(IEnumerable<Func<bool>> conditionFunctions)
            => AnyConditionIsMetAsync(conditionFunctions.Select(Task.Run));
        private static Task<bool> AnyConditionIsMetAsync(Func<bool> conditionFunction, params Func<bool>[] additionalConditionFunctions)
            => AnyConditionIsMetAsync(additionalConditionFunctions.Prepend(conditionFunction));


        #endregion
    }
}
ErroneousFatality