40

Is there a substitute for map that evaluates the list in parallel? I don't need it to be lazy.

Something like pmap :: (a -> b) -> [a] -> [b], so that I can write pmap expensive_function big_list and have all my cores at 100%.

Don Stewart
Clark Gaebel

2 Answers

46

Yes, see the parallel package:

ls `using` parList rdeepseq

will evaluate each element of the list in parallel via the rdeepseq strategy. Note that if each element is too cheap to be worth evaluating in its own spark, parListChunk with a well-chosen chunk size may perform better, because it creates one spark per chunk rather than one per element.
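As a rough sketch of the chunked variant, assuming the parallel package is installed: the function cheap and the chunk size of 1000 are made up for illustration, and a good chunk size has to be found by measuring.

```haskell
import Control.Parallel.Strategies (parListChunk, rdeepseq, using)

-- Hypothetical stand-in for a function that is too cheap to spark per element.
cheap :: Int -> Int
cheap x = x * x + 1

-- Evaluate the mapped list in chunks, creating one spark per chunk.
-- The chunk size 1000 is an arbitrary assumption, not a recommendation.
cheapChunked :: [Int] -> [Int]
cheapChunked xs = map cheap xs `using` parListChunk 1000 rdeepseq
```

Larger chunks mean fewer sparks but coarser load balancing, so the best value depends on the list length and the cost per element.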

EDIT: Based on your question I feel I should explain why this is an answer. It's because Haskell is lazy! Consider the statement

let bs = map expensiveFunction as

Nothing has been evaluated. You've just created a thunk that maps expensiveFunction. So how do we evaluate it in parallel?

let bs = map expensiveFunction as
    cs = bs `using` parList rdeepseq

Now don't use the bs list in your future computations; use the cs list instead. In other words, you don't need a parallel map: you can use the regular (lazy) map together with a parallel evaluation strategy.

EDIT: And if you look around enough you'll see the parMap function that does what I showed here but wrapped into one helper function.

In response to your comment, does the code below not work for you? It works for me.

import Control.Parallel.Strategies

func as =
        let bs = map (+1) as
            cs = bs `using` parList rdeepseq
        in cs
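For the pmap the question asks for, a minimal sketch might look like the following. The name pmap comes from the question, not from the parallel package. Because rdeepseq evaluates each element to normal form, an explicit type signature has to carry an NFData constraint on the result type.

```haskell
import Control.DeepSeq (NFData)
import Control.Parallel.Strategies (parList, rdeepseq, using)

-- Build the list lazily with the ordinary map, then evaluate every element
-- to normal form in parallel. rdeepseq is what requires the NFData constraint.
pmap :: NFData b => (a -> b) -> [a] -> [b]
pmap f xs = map f xs `using` parList rdeepseq
```

Dropping the constraint from the signature is one way to end up with a "No instance for NFData" error.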
Thomas M. DuBuisson
  • I tried doing `pmap f x = (map f x) \`using\` parList rdeepseq`, but GHC is complaining that rdeepseq needs an argument. – Clark Gaebel Apr 09 '11 at 16:41
  • @clark see the code I pasted - this should load into GHCi fine. Does it work for you? The expression `parMap rdeepseq f as` should do the same thing. – Thomas M. DuBuisson Apr 09 '11 at 16:44
  • Doesn't work for me. "No instance for (Control.DeepSeq.NFData b) arising from a use of `rdeepseq'" – Clark Gaebel Apr 09 '11 at 16:53
  • @clark you must be using it in a particular context or with an explicit type signature. Be sure the elements of your list have an `NFData` instance - that is required for use of `rdeepseq`. If that is too onerous then use `rseq` instead, which evaluates to WHNF. – Thomas M. DuBuisson Apr 09 '11 at 17:08
  • Okay. I did it with parMap, but I'm still only using one core here =/ – Clark Gaebel Apr 09 '11 at 18:18
  • @clark Did you compile with threaded (`ghc -O2 -threaded blah.hs --make`) and use the right RTS options (`./blah +RTS -Nx`) where `x` is the number of cores you want to use, such as `2`? Note on GHC 7 you should just be able to type `ghc -O2 -threaded -with-rtsopts=-N blah.hs` and run `./blah`. – Thomas M. DuBuisson Apr 09 '11 at 18:47
  • @TomMD: Yes I did. Does it matter that I'm on Windows? – Clark Gaebel Apr 10 '11 at 18:45
  • @clark No, Windows should work fine. Could you edit your question with the exact code you compiled, the compilation line (also using `-fforce-recomp`, just to be sure) and how you executed it? Also, find me on #haskell in IRC if you want to work this out real-time. – Thomas M. DuBuisson Apr 10 '11 at 19:30
  • Does this also work when the list you map over contains data structures you defined yourself? I tried this approach but I get a `No instance for (NFData myDataType)` error – Astarno Dec 27 '19 at 17:52
  • It works if the type has an instance for NFData, as you observed you need. – Thomas M. DuBuisson Dec 27 '19 at 22:11
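One possible way to provide that instance for a user-defined type is the Generic-based default in the deepseq package. In the sketch below the Point type and scaleAll are hypothetical, and it assumes a deepseq version recent enough to support the generic default (1.4 or later).

```haskell
{-# LANGUAGE DeriveGeneric #-}

import Control.DeepSeq (NFData)
import Control.Parallel.Strategies (parList, rdeepseq, using)
import GHC.Generics (Generic)

-- Hypothetical record standing in for a user-defined data type.
data Point = Point Double Double
  deriving (Show, Eq, Generic)

-- An empty instance body picks up the Generic-based default for rnf.
instance NFData Point

-- With the instance in scope, rdeepseq works on lists of Point.
scaleAll :: Double -> [Point] -> [Point]
scaleAll k ps = map scale ps `using` parList rdeepseq
  where
    scale (Point x y) = Point (k * x) (k * y)
```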
22

Besides using explicit strategies yourself as Tom has described, the parallel package also exports parMap:

 parMap :: Strategy b -> (a -> b) -> [a] -> [b]

where the strategy argument is something like rdeepseq.

And there's also parMap in the monad-par package (you step out of pure Haskell, and into a parallel monad):

 parMap :: NFData b => (a -> b) -> [a] -> Par [b]

The monad-par package is documented here.
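A small sketch of the Par-monad version, assuming the package published on Hackage as monad-par is installed. The function expensive is a made-up stand-in, and runPar is used to get back to a pure result.

```haskell
import Control.Monad.Par (parMap, runPar)

-- Hypothetical stand-in for an expensive pure function.
expensive :: Int -> Int
expensive n = sum [1 .. n]

-- parMap evaluates the elements in parallel inside the Par monad;
-- runPar runs that computation and returns the resulting list.
sumsInPar :: [Int] -> [Int]
sumsInPar xs = runPar (parMap expensive xs)
```

Unlike the strategies version, no strategy argument is needed: this parMap always evaluates results fully, which is why the result type must have an NFData instance.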

Don Stewart
  • There is a small caveat here. parMap is using mapM, which is strict. This means that the list spine is fully evaluated before computation starts - if the list is long, e.g. you are parMap'ping over records read from a (huge) file, this is probably not what you want. Perhaps this would work better with a lazy parMap, or by distributing elements round-robin. – Ketil Feb 22 '14 at 07:19
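One alternative that keeps the list lazy is parBuffer from the parallel package. This is a different technique from the Par-monad parMap discussed in the comment, shown only as a sketch: process and the buffer size of 64 are assumptions made up for illustration.

```haskell
import Control.Parallel.Strategies (parBuffer, rdeepseq, withStrategy)

-- Hypothetical stand-in for the work done on each record.
process :: Int -> Int
process x = x * 2

-- Spark elements a bounded distance ahead of the consumer instead of
-- sparking the whole list up front. The buffer size 64 is an arbitrary assumption.
processLazily :: [Int] -> [Int]
processLazily = withStrategy (parBuffer 64 rdeepseq) . map process
```

Since only a window of elements is sparked ahead of consumption, this may suit input that is streamed lazily, though whether it helps in practice depends on how the input is produced.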