7

I really like Repa's interface, quite apart from its concurrency capabilities. In fact, I need Repa's arrays to be evaluated sequentially: my arrays are relatively small, so parallelizing over them is useless and even harmful.

However, I do use parallelism in my program via parallel-io, so I compile with -threaded and run with +RTS -Nx, and this enables parallelism for Repa as well. Is there a way to turn off Repa's concurrency features?
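For concreteness, the build and run steps look roughly like this (the file name myprog.hs is just a placeholder):

```shell
# Link against the threaded runtime, then request 4 capabilities at run time.
# Both parallel-io and Repa will pick up the -N4 setting.
ghc -O2 -threaded myprog.hs
./myprog +RTS -N4 -RTS
```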

Hmm, while writing this I realized that it is unlikely I will need anything other than DIM1, so maybe I should switch to Vector. Nevertheless, an answer to the question would still be useful.
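For comparison, a DIM1-style map written against Data.Vector is always sequential, so +RTS -N has no effect on it. A minimal sketch, assuming the vector package (the data and the function are made up):

```haskell
import qualified Data.Vector.Unboxed as V

-- An unboxed vector plays the role of a Repa DIM1 array here;
-- V.map is evaluated sequentially regardless of RTS options.
doubled :: V.Vector Int
doubled = V.map (* 2) (V.fromList [1 .. 5])

main :: IO ()
main = print (V.toList doubled)
```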

The warning message I get from a parallel run is:

Data.Array.Repa: Performing nested parallel computation sequentially.
  You've probably called the 'force' function while another instance was
  already running. This can happen if the second version was suspended due
  to lazy evaluation. Use 'deepSeqArray' to ensure that each array is fully
  evaluated before you 'force' the next one.
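For reference, the pattern the warning message recommends looks roughly like this against the Repa 2 API (a sketch only; 'step' and the (* 2) function are made up):

```haskell
import Data.Array.Repa as R

-- Fully evaluate the input array before starting the next parallel
-- computation, so no suspended 'force' overlaps a running one.
step :: Array DIM1 Double -> Array DIM1 Double
step arr =
  let arr' = force arr
  in  arr' `deepSeqArray` force (R.map (* 2) arr')
```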

I don't actually call 'force' anywhere in my code.

Yrogirg
  • It is an interesting question, but are you *sure* that allowing repa to parallelize is hurting performance? Have you profiled it? – Dan Burton Dec 28 '11 at 08:54
  • @DanBurton at least with -N4 it runs ~3 times longer than without `+RTS -Nx`. Though it might be due to the output of the warning messages. I'll add the warning message to the post – Yrogirg Dec 28 '11 at 11:44
  • I would be incredibly surprised if repa parallelised operations on arrays too small to benefit from them. I would bet heavily on the warning message output causing the slowdown. – ehird Dec 28 '11 at 14:58
  • @ehird I've run the program without parallel-io parallelization, i.e. only with repa's `-N4`, and it is ~30% slower than the version with no parallelization whatsoever. That's just the nature of the program. – Yrogirg Dec 28 '11 at 15:58
  • I believe the goal of repa is to "just work"; there shouldn't be any slowdown no matter the nature of the program. I emailed Ben Lippmeier (listed as the maintainer of [the repa package](http://hackage.haskell.org/package/repa)) and the repa bug tracker email; hopefully we can draw some attention from people who *really* know what's going on with repa. – Dan Burton Dec 28 '11 at 17:40
  • Actually I meant that the slowdown of mapping relatively simple functions over small arrays is not due to Repa itself but due to hardware design. There is a certain cost to consolidating results computed on several CPU cores, and if this cost is comparable to the cost of the computation itself, then parallelization is of no use. GPUs are better in this sense --- slower cores, but much better 'harmonized'. – Yrogirg Dec 28 '11 at 18:37

1 Answer

3

Use the development version of Repa 3 from http://code.ouroborus.net/repa/repa-head. It has a version of 'force' (now called computeS) that evaluates the array sequentially.
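A minimal sketch of what that looks like with the Repa 3 API (the array contents and the (* 2) function are illustrative):

```haskell
import Data.Array.Repa as R

-- computeS evaluates the delayed array sequentially, even when the
-- program is linked with -threaded and run with +RTS -N.
doubled :: Array U DIM1 Int
doubled = computeS (R.map (* 2) xs)
  where
    xs = fromListUnboxed (Z :. (5 :: Int)) [1 .. 5]

main :: IO ()
main = print (toList doubled)
```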

Repa does not automatically sequentialise operations on small arrays. With (map f xs), the runtime depends as much on what 'f' does as on the size of 'xs'. Repa does not attempt to work out what 'f' is doing (that would be hard), so it cannot know in advance how expensive the computation will be.

Ben Lippmeier