12

What do I need? [an unordered list]

  • VERY easy parallelization
  • support for map, filter etc.
  • ability to perform array-based computations efficiently, like A=B+C, similar to MATLAB arrays.
  • Generation of SIMD code. I guess this is out of the question in the near future for anything, but hey, I can ask :)
  • support for matrices should be there at a minimum, higher dimensions are lower priority right now.
  • ability to get a pointer to it and create one from a C pointer.
  • Support from other libraries, i.e. bindings to popular C math packages, and I/O to disk or to images if the arrays are 2D

What do I see?

  • Array package in haskell-platform. It's the blessed one and can run in parallel.
  • Data.Vector. Has loop fusion, but not in platform, so its maturity is unknown to me.
  • repa package, contributed by the DPH team, but it doesn't work well with any stable GHC today.
  • Lots of variation in the level of support for array implementations. For instance, there doesn't seem to be an easy way to dump a 2D vector to an image file. In other words, the Haskell community apparently hasn't settled on an array implementation.

So please, help me choose.

EDIT: `A=B+C` refers to element-wise addition, not list concatenation.

Community
rpg
  • [cons](http://en.wikipedia.org/wiki/Cons) – nmichaels Mar 04 '11 at 16:46
  • Wait, aren't your requirements asking for an *ordered* list? I don't see how `A=B+C` could make sense on an unordered list unless `+` is `union`. – Dan Burton Mar 05 '11 at 17:37
  • @Dan I think you misunderstood. The request for efficient `A=B+C` was referring to numeric addition, not concatenation. So in Data.Vector that would be `zipWith (+)`. – Thomas M. DuBuisson Mar 10 '11 at 19:51
  • @TomMD `zipWith` works by pairing element 1 of the first list with element 1 of the second list. In other words, the order of elements in the lists matters. When a list is unordered, that means the order of the elements doesn't matter, which is why I said `A=B+C` is nonsense in such a scenario. (if there is a "1st" element, it is arbitrary, since the list's order doesn't mean anything) – Dan Burton Mar 10 '11 at 20:35
  • @Dan If rpg wishes to add vectors in the manner `[a, b, c] + [x, y, z] --> [a+x, b+y, c+z]` then `zipWith` is exactly what is needed. I think we are arguing without agreeing on what operation is desired. – Thomas M. DuBuisson Mar 10 '11 at 20:59
  • @TomMD I'm not trying to argue; I think you're absolutely right. I'm just saying that vectors (which rpg is apparently asking for) are not "unordered". – Dan Burton Mar 10 '11 at 23:23
  • @dan Oh, I get it! You think he wants the array not to have ordering. He only mentioned "unordered list" as in the bullets of what properties he desires of a vector library are in no particular order or preference. – Thomas M. DuBuisson Mar 11 '11 at 00:27
  • Tom is right, the vectors aren't sorted, and A=B+C refers to element wise addition here – rpg Mar 11 '11 at 05:04

3 Answers

8

Correct, the community hasn't settled on a good array implementation. I think it would be a good Haskell Prime submission to put forward the Vector API and remove Data.Array.

Vector is very mature! It has:

  • VERY easy parallelization
  • support for map, filter etc.
  • performs array based computations efficiently, like A=B+C (but I'm not in tune with how matlab does it)
  • vector creation from a pointer via Vector.Storable
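As a rough illustration of the `map`, `filter` and `A=B+C` points above, here is a minimal sketch using `Data.Vector.Unboxed` from the vector package. The name `addV` is made up for this example; `zipWith (+)` is the element-wise addition mentioned in the comments on the question.

```haskell
import qualified Data.Vector.Unboxed as U

-- Hypothetical helper: element-wise addition, i.e. A = B + C from the question.
addV :: U.Vector Double -> U.Vector Double -> U.Vector Double
addV = U.zipWith (+)

main :: IO ()
main = do
  let b = U.fromList [1, 2, 3]
      c = U.fromList [10, 20, 30]
      a = addV b c
  -- Element-wise sum of b and c.
  print (U.toList a)
  -- map and filter compose the same way they do on lists.
  print (U.toList (U.filter (> 30) (U.map (* 2) a)))
```

If both vectors have different lengths, `zipWith` stops at the shorter one, so a length check may be worth adding in real code.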

It does not:

  • have enough support from other libraries, i.e. bindings to popular C math packages
  • support matrices, but you can have vectors of vectors. If you build some vector-based matrix operations then perhaps you could upload them to Hackage as vector-matrix.
  • Generate SIMD code.

NOTE: You can turn bytestrings into vectors of whatever, so if you have an image as a bytestring then, via Vector.Storable, you might be able to do what you want with the image as a vector.
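To make the pointer round trip concrete, here is a small sketch, assuming the vector package and the standard `Foreign` modules from base. `fromBuffer` and `viaPointer` are hypothetical names for this example, and the buffer is allocated in Haskell to stand in for memory that would normally come from C.

```haskell
import qualified Data.Vector.Storable as VS
import Foreign.ForeignPtr (mallocForeignPtrArray, withForeignPtr)
import Foreign.Marshal.Array (peekArray, pokeArray)

-- Fill a raw buffer, then wrap it as a storable vector without copying.
fromBuffer :: [Double] -> IO (VS.Vector Double)
fromBuffer xs = do
  let n = length xs
  fp <- mallocForeignPtrArray n
  withForeignPtr fp $ \p -> pokeArray p xs
  -- The vector shares memory with fp, so the buffer must not be mutated afterwards.
  return (VS.unsafeFromForeignPtr0 fp n)

-- Get a raw pointer to the vector's data and read it back as a list.
viaPointer :: VS.Vector Double -> IO [Double]
viaPointer v = VS.unsafeWith v $ \p -> peekArray (VS.length v) p

main :: IO ()
main = do
  v  <- fromBuffer [1, 2, 3]
  ys <- viaPointer (VS.map (* 2) v)
  print ys
```

The `unsafe` prefix is a reminder that the pointer must not escape the callback and that the underlying memory should be treated as immutable once it is wrapped.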

Thomas M. DuBuisson
  • The parallelism seems limited to boxed vectors, thereby losing a lot of the benefit. – rpg Mar 04 '11 at 18:11
  • @rpg Yes, I made `vector-strategies` using the obvious combinations of `parallel` and `deepseq` tools. It's not like you can get parallel unboxed vectors without going to something backed by more research (is repa unboxed?), the primitives just aren't there to support such work. – Thomas M. DuBuisson Mar 04 '11 at 20:10
  • repa is unboxed only. They don't even check indices before referencing them. – rpg Mar 05 '11 at 04:00
  • Have you looked at the interface? It's horrendous. – Thomas M. DuBuisson Mar 05 '11 at 16:07
3

(I am not allowed to comment)

rpg: Does hmatrix accept Data.Vector? It has a Data.Packed.Vector but are they the same?

Yes. The latest version of hmatrix uses Data.Vector.Storable by default for 1D vectors (previously it was optional). The dependency on vector is not shown on Hackage, probably because it sits behind a configuration flag.

For LAPACK compatibility matrices are not Vector or Vector t, but they can be easily converted (e.g.: Data.Vector.fromList . toRows).

TryTryAgain
Alberto Ruiz
2

If you want bindings to popular C libraries, the best options are probably hmatrix and blas. Blas is just a binding to a BLAS library, whereas hmatrix provides some higher-level operations. There are also many libraries built upon hmatrix offering further functionality. If you're doing any sort of matrix work, that's what I would start with.

The vector package is also a good choice; it's stable and provides excellent performance. The Data.Vector.Storable types are represented as C arrays, so it's trivial to interface from them to other C libraries. The biggest drawback is that there's no matrix support, so you'd have to do that yourself.
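For the "do that yourself" part, one possible starting point is a thin row-major wrapper over a storable vector. This is only a sketch: the `Matrix` type and the functions `fromLists`, `at`, `row` and `addM` are invented for this example and are not part of the vector package, and `fromLists` assumes every row has the same length.

```haskell
import qualified Data.Vector.Storable as VS

-- Hypothetical minimal matrix: dimensions plus row-major storage.
data Matrix = Matrix
  { rows  :: !Int
  , cols  :: !Int
  , cells :: !(VS.Vector Double)
  }

-- Build a matrix from a list of rows (assumed to be rectangular).
fromLists :: [[Double]] -> Matrix
fromLists xss = Matrix (length xss) width (VS.fromList (concat xss))
  where
    width = if null xss then 0 else length (head xss)

-- Element at row i, column j (zero-based, bounds-checked by VS.!).
at :: Matrix -> Int -> Int -> Double
at m i j = cells m VS.! (i * cols m + j)

-- Row i as a slice of the underlying storage (no copy).
row :: Matrix -> Int -> VS.Vector Double
row m i = VS.slice (i * cols m) (cols m) (cells m)

-- Element-wise addition of two matrices with matching dimensions.
addM :: Matrix -> Matrix -> Matrix
addM a b
  | rows a == rows b && cols a == cols b =
      a { cells = VS.zipWith (+) (cells a) (cells b) }
  | otherwise = error "addM: dimension mismatch"

main :: IO ()
main = do
  let s = addM (fromLists [[1, 2], [3, 4]]) (fromLists [[10, 20], [30, 40]])
  print (at s 1 0)             -- 33.0
  print (VS.toList (row s 1))  -- [33.0,44.0]
```

Because the storage is a single contiguous storable vector, the whole matrix can still be handed to a C routine through one pointer.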

As for exporting to an image format, most Haskell image libraries seem to use ByteStrings. You could either convert to a ByteString, or bind to a C library that does what you want. If you find a Haskell library that does what you want, it should be easy enough to convert hmatrix data to the proper format.

John L
  • Does hmatrix accept Data.Vector? It has a Data.Packed.Vector but are they the same? – rpg Mar 04 '11 at 17:44
  • @rpg They can't be the same without `hmatrix` depending on `vector`, so no. If you click on the haddock documentation you can see source links which shows their `Vector` [definition](http://hackage.haskell.org/packages/archive/hmatrix/0.11.0.1/doc/html/src/Data-Packed-Internal-Vector.html#Vector). – Thomas M. DuBuisson Mar 04 '11 at 20:09