I've written a program that processes a large number of data samples using Repa. Performance is key for this program. A large part of the operations requires parallel maps/folds over multi-dimensional arrays, and Repa is perfect for this. However, part of my program only uses one-dimensional arrays and doesn't require parallelism (i.e. the overhead of parallelism would harm performance). Some of these operations need functions like take or folds with custom accumulators, which Repa doesn't support. So I'm writing these operations myself by iterating over the Repa array.

Am I better off rewriting these operations with Vector instead of Repa? Would that result in better performance?

I've read somewhere that one-dimensional Repa arrays are implemented as Vectors 'under the hood' so I doubt that Vectors result in better performance. On the other hand, Vector does have some nice built-in functions that I could use instead of writing them myself.
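
To make the comparison concrete, here is a minimal sketch of the kind of sequential operations described above (a take and a fold with a custom accumulator), written against Data.Vector.Unboxed. The names `firstSamples`, `Stats`, and `summarize` are hypothetical and only illustrate the idea; they are not taken from the actual program.

```haskell
import qualified Data.Vector.Unboxed as U

-- Hypothetical helper: keep only the first n samples.
-- U.take is a built-in, so nothing has to be hand-written here.
firstSamples :: Int -> U.Vector Double -> U.Vector Double
firstSamples = U.take

-- Custom accumulator with strict fields: count, sum and maximum.
data Stats = Stats !Int !Double !Double

-- Sequential strict left fold that computes all three values in one pass.
-- For an empty vector the maximum stays at negative infinity.
summarize :: U.Vector Double -> (Int, Double, Double)
summarize v = (n, s, m)
  where
    Stats n s m = U.foldl' step (Stats 0 0 (-1 / 0)) v
    step (Stats c t mx) x = Stats (c + 1) (t + x) (max mx x)
```

For example, `summarize (firstSamples 1000 samples)` would process only the first 1000 elements of a vector `samples` without any parallel scheduling involved.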

Thomas Vanhelden
  • Repa uses Unboxed (`U`) or Storable (`F`) Vector for all dimensions under the hood, not just `DIM1`. So, I'd say, if you don't need to process those `DIM1` arrays in parallel, you are better off using the `vector` package directly. – lehins Jan 25 '17 at 13:23
  • Yes, you're right. Every N-dimensional Array is a Vector under the hood. It's only the shape that changes. – Thomas Vanhelden Jan 25 '17 at 16:56

1 Answer

I've implemented some parts of my program with Data.Vector.Unboxed instead of using one-dimensional Data.Array.Repa. Except for some minor improvements, the algorithms are the same. Data.Vector.Unboxed seems to be 4 times faster than one-dimensional Data.Array.Repa for sequential operations.
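
If the rest of the program keeps using Repa, the sequential parts can still be run on the underlying vector. The sketch below assumes the arrays use the unboxed (`U`) representation and uses Repa's `toUnboxed` and `fromUnboxed`, which as far as I know only unwrap and rewrap the vector without copying. `prefixSums` is a hypothetical example operation, not one from my program.

```haskell
import qualified Data.Array.Repa as R
import           Data.Array.Repa (Array, U, DIM1)
import qualified Data.Vector.Unboxed as U

-- Hypothetical sequential step: running sums over a 1-D unboxed Repa array.
-- The array is unwrapped to its Data.Vector.Unboxed vector, processed with
-- a built-in strict scan, and wrapped again with the original extent
-- (the scan preserves the length, so the shape still matches).
prefixSums :: Array U DIM1 Double -> Array U DIM1 Double
prefixSums arr =
  R.fromUnboxed (R.extent arr) (U.scanl1' (+) (R.toUnboxed arr))
```

This keeps the parallel multi-dimensional code on Repa while the one-dimensional sequential code goes through the `vector` API.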

Thomas Vanhelden