Chapter 7: Purely functional parallelism

Contents

  • Notes: Chapter notes and links to further reading related to the content in this chapter
  • FAQ: Questions and answers related to the chapter content. Feel free to add your own questions and/or answers here.

Notes

FP has a long history of using combinator libraries for expressing parallelism, and there are a lot of variations of the general idea. The main design choices in this sort of library are around how explicit to make the forking and joining of parallel computations. That is, should the API force the programmer to be fully explicit about when parallel tasks are being forked off into a separate logical thread, or should this be done automatically? And similarly, when waiting for the results of multiple logical threads (say, for the implementation of map2), should the order of these joins be something the programmer explicitly specifies or chosen by the framework?

The library we developed in this chapter sits somewhere in the middle--it is explicit about where tasks are forked, but not when tasks are joined (notice that map2 picks the order in which it waits for the two tasks whose results it is combining). The join order can be made more explicit. Simon Marlow, one of the GHC Haskell developers, discusses this in Parallel programming in Haskell with explicit futures. Also see the full paper, Seq no more: Better Strategies for Parallel Haskell, which does a nice job of explaining some of the history of approaches for parallelism in Haskell.
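
To see this concretely, here's a minimal sketch of map2 along the lines of the chapter's representation, type Par[A] = ExecutorService => Future[A] (the UnitFuture wrapper mirrors the one in the chapter; this is a sketch, not the full library):

import java.util.concurrent.{ExecutorService, Future, TimeUnit}

type Par[A] = ExecutorService => Future[A]

// A Future that is already complete, wrapping an eagerly computed value.
case class UnitFuture[A](get: A) extends Future[A] {
  def isDone = true
  def get(timeout: Long, units: TimeUnit) = get
  def isCancelled = false
  def cancel(evenIfRunning: Boolean): Boolean = false
}

def map2[A, B, C](a: Par[A], b: Par[B])(f: (A, B) => C): Par[C] =
  (es: ExecutorService) => {
    val af = a(es)
    val bf = b(es)
    // The join order is fixed here by the library, not by the caller:
    // we wait on af first, then on bf.
    UnitFuture(f(af.get, bf.get))
  }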

Note that because Scala is a strict-by-default language, being more explicit about the join order isn't necessarily as beneficial as in Haskell. That is, we can get away with reasoning about join order much like we think about evaluation in normal strict function application.

This style of library isn't particularly good at expressing the pipeline parallelism that's possible when transforming streams of data. For instance, if we have a Stream[Int] of 10000 items, and we wish to square each value, then compute a running sum of the squared values, there is a potential for parallelism--as we are squaring values, we can be passing the squared values off to another consumer that is emitting the running sum of the values it receives. We'll be discussing stream processing and pipeline parallelism more in part 4.
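
For reference, here is a sequential rendering of that example using LazyList (the modern name for Scala's Stream); the point is that the summing stage could in principle run concurrently with the squaring stage, which this library has no good way to express:

// Square each of 10000 values, then compute running sums of the squares.
// Evaluated sequentially here; pipeline parallelism would overlap the two stages.
val squares     = LazyList.range(1, 10001).map(n => n * n)
val runningSums = squares.scanLeft(0)(_ + _)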

Notes about map fusion

We noted in this chapter that one of our laws for Par, sometimes called map fusion, can be used as an optimization:

map(map(y)(g))(f) == map(y)(f compose g)

That is, rather than spawning a separate parallel computation to compute the second mapping, we can fold it into the first mapping. We mentioned that our representation of Par doesn't allow for this, as it's too 'opaque'. If we make Par a proper data type and give it constructors that we can pattern match on, then it's easy to implement map fusion:

trait Par[+A] {
  def map[B](f: A => B): Par[B] = this match {
    case MapPar(p, g) => MapPar(p, g andThen f) // fuse the two mappings into one
    case _ => MapPar(this, f)
  }
}
case class MapPar[A, +B](par: Par[A], f: A => B) extends Par[B]
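
To see the fusion at work, assume a trivial leaf constructor (hypothetical, not part of the chapter's API):

case class Done[A](a: A) extends Par[A] // hypothetical leaf node, for illustration only

val p = Done(21).map(_ * 2).map(_ + 1)
// The second map hits the MapPar case, so p is a single node,
// MapPar(Done(21), ((_: Int) * 2) andThen (_ + 1)),
// rather than two nested MapPar nodes.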

Baking ad hoc optimization rules like this into our data type works, but it can sometimes get unwieldy, and it's not very modular (we don't get to reuse code if there's some other data type needing similar optimizations). There are various ways of factoring out these sorts of optimizations so our core data type (be it Par or some other type) stays clean, and the optimization is handled as a separate concern. Edward Kmett has a nice blog series discussing this approach. Before embarking on that series you'll need to be familiar with the content in part 3 of this book, and you should read the Haskell appendix as well.

FAQ

The proof of map fusion seems wrong (Answer 7.7)

The first step in the proof substitutes the identity function for g, which shrinks the statement to be proved; that is not allowed. The error is less obvious because two substitutions are made at once in that step, but it is still there. Such substitutions are only allowed when starting from given laws: a specialized instance of a given law is still true, whereas proving a specialized instance of a statement does not prove the general statement.

To illustrate the point with a simple example: using the same erroneous reasoning, given x * 0 == 0 for any x, we can "prove" x * 2 == x * 3 as a free theorem:

x * 2 == x * 3 // Start with the statement to be proved
0 * 2 == 0 * 3 // Substitute 0 for x
0 == 0 // Simplify using the given law; both sides are equal, hurrah
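
By contrast, specializing a given law is sound, because every instance of a true universal statement is itself true:

x * 0 == 0 // Given law, holds for all x
5 * 0 == 0 // Substitute 5 for x: still true
// The reverse direction is what's forbidden: proving 5 * 0 == 0
// does not establish x * 0 == 0 for all x.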