I was surprised to find that Data.Array.Repa multiplies two 1000x1000 matrices faster than hmatrix does, even though hmatrix is implemented on top of LAPACK. Is this because Repa uses unboxed arrays?
import Data.Array.Repa
import Data.Array.Repa.Algorithms.Matrix

main :: IO ()
main = do
    -- Two 1000x1000 unboxed matrices, every element 1.0
    let a = fromListUnboxed (Z:.1000:.1000::DIM2) $ replicate (1000*1000) 1.0 :: Array U DIM2 Double
        b = fromListUnboxed (Z:.1000:.1000::DIM2) $ replicate (1000*1000) 1.0 :: Array U DIM2 Double
    -- mmultP computes the matrix product in parallel
    m <- a `mmultP` b
    print $ m!(Z:.900:.900)
Running time with 1 core: 7.011s
Running time with 2 cores: 3.975s
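import Numeric.LinearAlgebra
import Numeric.LinearAlgebra.LAPACK

main :: IO ()
main = do
    -- Two 1000x1000 matrices built from lists, every element 1.0
    let a = (1000><1000) $ replicate (1000*1000) 1.0
        b = (1000><1000) $ replicate (1000*1000) 1.0
    -- multiplyR is the matrix product for Double matrices
    print $ (a `multiplyR` b) @@> (900,900)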
Running time: 20.714s