> It seems that it could be automatically detected that `a` and `b` can be evaluated in parallel
Parallelism can be detected automatically, as you hint at, by looking at the dependencies between values. This is particularly easy when there are no side effects involved.
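For instance, in this minimal sketch (where `expensive` is a made-up stand-in for some costly pure function), `a` and `b` have no data dependency on each other, so a compiler could in principle evaluate them in parallel:

```haskell
-- 'expensive' is a hypothetical placeholder for a costly pure computation.
expensive :: Int -> Int
expensive n = sum [n .. 10000000]

result :: Int
result = a + b
  where
    a = expensive 1   -- depends only on its argument
    b = expensive 2   -- independent of 'a', so it could run in parallel

main :: IO ()
main = print result
```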
The problem is knowing when to stop making things parallel.
It all boils down to knowing at compile time how much work will occur at runtime. These "cost models" are hard to do in general for arbitrary code.
Consider:
- should every argument to `(+)` be evaluated in parallel?
- should every `map` be evaluated in parallel?
If we naively parallelize all independent computations, the compiler will generate a huge number of parallel tasks: millions or billions of parallel expressions, which our 8- or 16-core machines are simply not ready to handle.
Naive parallelization results in massive overheads trying to schedule work onto the small amount of available parallel hardware.
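As a rough illustration, here is what that naive strategy looks like if we spell it out by hand with GHC's `par` combinator from the `parallel` package (mentioned below): one spark per list element, each doing only a single multiplication, so the scheduling cost dwarfs the useful work.

```haskell
import Control.Parallel (par, pseq)

-- Sparks a separate parallel task for every list element. On a list of
-- ten million elements this creates ten million sparks for a handful of
-- cores, and each task does only one multiplication.
naiveParMap :: (a -> b) -> [a] -> [b]
naiveParMap _ []     = []
naiveParMap f (x:xs) =
  let y  = f x
      ys = naiveParMap f xs
  in y `par` (ys `pseq` (y : ys))

main :: IO ()
main = print (sum (naiveParMap (* 2) [1 .. 10000000 :: Int]))
```

Compiled with `-threaded` and run with `+RTS -N`, most of those sparks never pay for themselves, which is exactly the overhead problem described above.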
This gap between the amount of parallelism in pure programs and the available hardware forces us to make some compromises. Namely:
- user-annotated hints of which things are costly enough to do in parallel
- subsets of the language that have a clear cost model, so the compiler can be smart.
Examples of the first form -- user hints -- are `par` annotations or the `Par` monad.
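As a sketch of the user-hint approach, here is a coarse-grained split written with the `Par` monad from the `monad-par` package (the `par` annotation style from `Control.Parallel` works similarly): the programmer, not the compiler, decides that these two halves are each expensive enough to be worth a parallel task.

```haskell
import Control.Monad.Par (runPar, spawnP, get)

-- The programmer marks exactly two coarse-grained tasks, a granularity
-- the compiler could not reliably choose on its own.
parSums :: Int
parSums = runPar $ do
  lo <- spawnP (sum [1       .. 5000000  :: Int])  -- fork first half
  hi <- spawnP (sum [5000001 .. 10000000 :: Int])  -- fork second half
  x  <- get lo
  y  <- get hi
  return (x + y)

main :: IO ()
main = print parSums
```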
For the second form -- automatically parallel sub-languages -- see Data Parallel Haskell.