The main reason is that referential transparency (and even more so laziness) abstracts over the execution order. This makes it trivial to parallelize evaluation.
For example, if both a
, b
, and ||
are referentially transparent, then it doesn't matter if in
a || b
a
gets evaluated first, b
gets evaluated first, or b
doesn't get evaluated at all (because a
was evaluated to true
).
In
a || a
it doesn't matter if a
gets evaluated once or twice (or, heck, even 5 times … which wouldn't make sense, but doesn't matter nonetheless).
So, if it doesn't matter in which order they are evaluated and it doesn't matter whether they are evaluated needlessly, then you can simply evaluate every sub-expression in parallel. So, we could evaluate a
and b
in parallel, and then, ||
could wait for either one of the two threads to finish, look what it returned, and if it returned true
, it could even cancel the other one and immediately return true
.
Every sub-expression can be evaluated in parallel. Trivially.
Note, however, that this is not a magic bullet. Some experimental early versions of GHC did this, and it was a disaster: there was just too much potential parallelism. Even a simple program could spawn hundreds, thousands, millions of threads and for the overwhelming majority of sub-expressions spawning the thread takes much longer than just evaluating the expression in the first place. With so many threads, context switching time completely dominates any useful computation.
You could say that functional programming turns the problem on its head: typically, the problem is how to break apart a serial program into just the right size of parallel "chunks", whereas with functional programming, the problem is how to group together parallel sub-programs into serial "chunks".
The way GHC does it today is that you can manually annotate two subexpressions to be evaluated in parallel. This is actually similar to how you would do it in an imperative language as well, by putting the two expressions into separate threads. But there is an important difference: adding this annotation can never change the result of the program! It can make it faster, it can make it slower, it can make it use more memory, but it cannot change its result. This makes it way easier to experiment with parallelism to find just the right amount of parallelism and the right size of chunks.