Scala compiler: detecting a pure/impure function

Question

In FP languages such as Scala and Haskell, pure functions are used, which makes it possible for the compiler to optimize the code. For example:

val x = method1() // a pure function call
val y = method2() // another pure function call
val c = method3(x, y)

Because method1 and method2 are pure functions, their evaluations are independent of each other, so the compiler can parallelize the two calls.

A language like Haskell has constructs (such as the IO monad) that indicate whether a function is pure or performs some I/O operation. But how does the Scala compiler detect that a function is pure?

Scala does not classify code into pure or impure. From the compiler's point of view everything is impure. Also, I'm not aware of any language that performs automatic parallelization based just on the idea of purity; even in Haskell you need to be explicit about parallelism. — Ionuț G. Stan, Aug 21 '21 at 09:03
I found it here https://www.defmacro.org/2006/06/19/fp.html, and https://en.wikipedia.org/wiki/Functional_programming#Pure_functions — Mandroid, Aug 21 '21 at 10:19
For most pure functions the parallelization cost would be bigger than the advantage of running them in parallel. You are better controlling that kind of stuff. Anyways, projects like **cats-effect** or **ZIO** provide `IO` monads that you can use to ensure the purity of your code and running code concurrently. — Luis Miguel Mejía Suárez, Aug 21 '21 at 13:05

score 7 · Accepted Answer · answered Aug 21 '21 at 15:05

The general approach to classifying a block of code as pure is to define which primitive operations are pure; since purity composes, any composition of pure operations is itself pure.
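As a toy illustration of that compositional idea, here is a minimal sketch of a structural purity check over a tiny expression language. The `Expr` type, its constructors, and `isPure` are hypothetical names invented for this example; this is not something the Scala compiler does.

```scala
object PurityByComposition {
  // Hypothetical toy expression language; not part of Scala or its compiler.
  sealed trait Expr
  final case class Lit(value: Int) extends Expr              // pure primitive
  final case class Add(left: Expr, right: Expr) extends Expr // pure if both operands are pure
  final case class Print(inner: Expr) extends Expr           // impure primitive (writes output)

  // Purity is decided structurally: primitives are classified by hand,
  // and a composite is pure only if all of its parts are pure.
  def isPure(expr: Expr): Boolean = expr match {
    case Lit(_)           => true
    case Add(left, right) => isPure(left) && isPure(right)
    case Print(_)         => false
  }
}
```

With these definitions, `isPure(Add(Lit(1), Lit(2)))` is true, while wrapping either operand in `Print` makes the whole expression impure.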

Parallelization isn't actually one of the more important benefits of pure code: the real benefit is that any evaluation strategy can be used. Evaluation can be reordered, results can be cached, and so on. Parallelization is just another evaluation strategy, but without a good sense of the actual execution cost (and note that modern CPUs and memory hierarchies can make it really difficult to get such a sense), it often slows things down relative to other strategies. For modern pure code, laziness and caching of repeated values are often more generally effective, while parallelism is controlled by the developer (one benefit of pure code is that you can make arbitrary changes to how you're parallelizing without changing the semantics of the code).
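To make the "parallelism is controlled by the developer" point concrete, here is a minimal sketch using the standard library's `Future`. The bodies of `method1`, `method2`, and `method3` are hypothetical stand-ins for the pure functions in the question; only their purity matters.

```scala
import scala.concurrent.{Await, Future}
import scala.concurrent.ExecutionContext.Implicits.global
import scala.concurrent.duration._

object ExplicitParallelism {
  // Hypothetical pure functions standing in for method1..method3 from the question.
  def method1(): Int = (1 to 1000).sum
  def method2(): Int = (1 to 10).product
  def method3(x: Int, y: Int): Int = x + y

  // Sequential evaluation: method1 runs, then method2, then method3.
  def sequential(): Int = method3(method1(), method2())

  // Explicitly parallel evaluation: both futures are created (and so started)
  // before either result is needed, then combined once both complete.
  def parallel(): Int = {
    val fx = Future(method1())
    val fy = Future(method2())
    val combined = for { x <- fx; y <- fy } yield method3(x, y)
    Await.result(combined, 10.seconds)
  }
}
```

Because the functions are pure, `sequential()` and `parallel()` return the same value; the programmer chose the strategy, not the compiler. For work this small, the scheduling overhead of the futures is likely to outweigh any gain.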

In the case of Scala, the compiler makes no real effort to classify code as pure or impure and generally doesn't try alternative evaluation strategies: control of that is left to the programmer (the language helps somewhat by providing call-by-name parameters and lazy vals).
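A minimal sketch of those two language features follows. The `expensive` function and the `evaluations` counter are hypothetical and exist only to make the evaluation order observable (the counter is itself a side effect, used here purely for demonstration).

```scala
object EvaluationDemo {
  // Counts how many times the hypothetical expensive computation actually runs.
  var evaluations = 0

  def expensive(): Int = {
    evaluations += 1
    42
  }

  // Call-by-name parameter: `value` is evaluated only when (and each time) it is used.
  def whenEnabled(enabled: Boolean)(value: => Int): Int =
    if (enabled) value else 0

  def main(args: Array[String]): Unit = {
    whenEnabled(enabled = false)(expensive()) // the argument is never evaluated
    assert(evaluations == 0)

    lazy val cached = expensive() // evaluated on first access, then cached
    val sum = cached + cached     // second access reuses the cached value
    assert(sum == 84)
    assert(evaluations == 1)
  }
}
```

In both cases the programmer, not the compiler, decides that evaluation may be skipped or shared; for a pure `expensive` that choice would not change the result.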

The JVM's JIT compiler can and does perform some purity analysis on bytecode when deciding what it can safely inline and reorder. This isn't Scala-specific, though final local variables (local vals in Scala, final variables in Java) enable some optimizations that can't otherwise be performed. JavaScript runtimes (for Scala.js) can likewise perform that analysis (and in practice do so really aggressively), as does LLVM (for Scala Native).

This is largely speculation on my part, but because there is this common sense that "parallelizing things is the GO FASTER button", I suspect that advocates for writing pure code emphasize the ease of parallelizing pure code purely as part of the sales pitch. — Levi Ramsey, Aug 21 '21 at 15:08

score 2 · Answer 2 · answered Aug 22 '21 at 09:36

In the general case, purity analysis is equivalent to solving the Halting Problem. In other words, it is impossible to statically decide, for arbitrary code, whether a chunk of code is pure or not.

In a language like Haskell, there is no way of writing impure code, so purity analysis is trivial. Here is a simple function that takes a Haskell program as an argument and tells you whether it is pure or not:

isPureProgram :: a -> Bool
isPureProgram _ = True

Note, I am simplifying a couple of things here:

- `unsafePerformIO` and friends allow you to, well, perform unsafe I/O. It is generally assumed that you know what you are doing when you use these functions.
- Exceptions are side-effects.
- Contrary to popular belief, the IO monad does not allow you to write impure code in Haskell. What the IO monad does is let you write a pure program which returns a list of IO actions, which, when interpreted by the runtime system, result in impure computation. However, the Haskell program which generates these IO actions is still pure; it is the interpreter which is impure. But of course, the end result will be the same: an impure computation will be performed.
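The last point, a pure program that only describes effects and an impure interpreter that runs them, can be sketched in Scala. This is a deliberately tiny, hypothetical `IO` type written for illustration; it is not Haskell's IO, nor the API of cats-effect or ZIO, and the `log` buffer stands in for real I/O.

```scala
object TinyIO {
  // Hypothetical minimal IO type: a value that describes an effect without performing it.
  final case class IO[A](unsafeRun: () => A) {
    def map[B](f: A => B): IO[B] = IO(() => f(unsafeRun()))
    def flatMap[B](f: A => IO[B]): IO[B] = IO(() => f(unsafeRun()).unsafeRun())
  }

  // Stand-in for the outside world; only touched when an IO value is run.
  val log = scala.collection.mutable.ListBuffer.empty[String]

  // Building this value has no side effect; it only describes one.
  def record(msg: String): IO[Unit] = IO(() => { log += msg; () })

  // A pure description of a two-step effectful program.
  val program: IO[Int] = for {
    _ <- record("start")
    _ <- record("end")
  } yield 2
}
```

Constructing `TinyIO.program` leaves `log` empty. Only calling `TinyIO.program.unsafeRun()`, which plays the role of the impure interpreter, appends the two messages and produces the result.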

However, since Scala is an impure language at its core, its compiler cannot rely on the kind of restrictions that a Haskell compiler can, and thus cannot perform purity analysis in the general case.
