Questions tagged [blas]

The Basic Linear Algebra Subprograms are a standard set of interfaces for low-level vector and matrix operations commonly used in scientific computing.

A reference implementation is available at Netlib; optimized implementations are also available for most high-performance computing architectures, for example ATLAS, OpenBLAS, and Intel MKL.

The BLAS routines are divided into three levels:

  • Level 1: vector operations (e.g. vector addition, dot product)
  • Level 2: matrix-vector operations (e.g. matrix-vector multiplication)
  • Level 3: matrix-matrix operations (e.g. matrix multiplication)
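As a concrete illustration of the three levels, here is a minimal sketch using SciPy's low-level BLAS wrappers (scipy.linalg.blas); ddot, dgemv, and dgemm are the standard double-precision routines at Levels 1, 2, and 3 respectively.

```python
import numpy as np
from scipy.linalg import blas

x = np.array([1.0, 2.0, 3.0])
y = np.array([4.0, 5.0, 6.0])
A = np.array([[1.0, 0.0, 2.0],
              [0.0, 3.0, 0.0],
              [4.0, 0.0, 5.0]])
B = np.eye(3)

dot = blas.ddot(x, y)        # Level 1: dot product x . y
v   = blas.dgemv(1.0, A, x)  # Level 2: matrix-vector product A @ x
C   = blas.dgemm(1.0, A, B)  # Level 3: matrix-matrix product A @ B

print(dot)   # 32.0
print(v)     # [ 7.  6. 19.]
```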
906 questions
10
votes
2 answers

ATLAS gemm linking undefined reference to 'cblas_sgemm'

This is the first time I am trying to use ATLAS. I am not able to link it properly. Here is a very simple sgemm program: ... #include const int M=10; const int N=8; const int K=5; int main() { float *A = new float[M*K]; float *B…
usman
  • 1,285
  • 1
  • 14
  • 25
9
votes
2 answers

Should I prefer stride one memory access for either reading or writing?

It's well known that accessing memory in a stride one fashion is best for performance. In situations where I must access one region of memory for reading, I must access another region for writing, and I may only access one of the two regions in a…
Rhys Ulerich
  • 1,242
  • 1
  • 12
  • 28
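For readers unfamiliar with what "stride one" means in practice, a small NumPy sketch (not part of the original question) shows the stride difference between row and column access in a C-ordered float64 array:

```python
import numpy as np

A = np.zeros((1000, 1000))  # C-ordered (row-major) float64

# In row-major layout the last index is contiguous: stepping along a
# row moves 8 bytes per element, stepping down a column moves a full
# row (8000 bytes) per element.
print(A.strides)   # (8000, 8)

row = A[0, :]      # stride-one access
col = A[:, 0]      # strided access
print(row.strides) # (8,)
print(col.strides) # (8000,)
```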
9
votes
2 answers

Calling MATLAB's built-in LAPACK/BLAS routines

I want to learn how to call the built-in LAPACK/BLAS routines in MATLAB. I have experience with MATLAB and MEX files, but I've actually no idea how to call the LAPACK or BLAS libraries. I've found gateway routines on the File Exchange that simplify the…
petrichor
  • 6,459
  • 4
  • 36
  • 48
9
votes
2 answers

Use BLAS and LAPACK from Eigen

I've implemented a piece of code with Eigen and I would like Eigen to use BLAS and LAPACK. I've seen here that it is possible, but I don't know how or where to put those values/directives in the code. I have to specify somewhere the value…
Santi Peñate-Vera
  • 1,053
  • 4
  • 33
  • 68
9
votes
1 answer

Find BLAS include directory with CMake

In CMake I use find_package(BLAS REQUIRED) and I use the BLAS_FOUND, BLAS_LINKER_FLAGS, BLAS_LIBRARIES variables as appropriate. My question is, how do I, based on the BLAS implementation that has been selected, find the include directory that…
9
votes
3 answers

Julia Memory Allocation for Addition of Two Matrices in place

I'm curious why Julia's implementation of matrix addition appears to make a copy. Here's an example: foo1=rand(1000,1000) foo2=rand(1000,1000) foo3=rand(1000,1000) julia> @time foo1=foo2+foo3; 0.001719 seconds (9 allocations: 7.630 MB) julia>…
Lindon
  • 1,292
  • 1
  • 10
  • 21
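The distinction the question is after also exists in NumPy, and a Python sketch may make it concrete (this is an analogy, not the Julia answer itself): `c = a + b` allocates a fresh result array, while np.add with an out= argument writes into an existing buffer.

```python
import numpy as np

a = np.random.rand(1000, 1000)
b = np.random.rand(1000, 1000)
c = np.empty_like(a)

d = a + b                  # allocates a fresh 1000x1000 result
res = np.add(a, b, out=c)  # reuses c's buffer; no new allocation

print(res is c)   # True: the output is the preallocated array
```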
9
votes
1 answer

How to measure overall performance of parallel programs (with papi)

I asked myself what would be the best way to measure the performance (in flops) of a parallel program. I read about papi_flops. This seems to work fine for a serial program. But I don't know how I can measure the overall performance of a parallel…
Sebastian
  • 153
  • 1
  • 8
9
votes
1 answer

How to check which BLAS is in my Ubuntu system?

In particular, I would like to know if xianyi's OpenBLAS has been installed. I work on several PCs and have installed it on several of them over the past couple of years, but I've lost track of which ones. I need to know which PC has it…
ng0323
  • 317
  • 1
  • 6
  • 14
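One quick check from Python, assuming NumPy is installed on the machine in question, is to ask NumPy which BLAS/LAPACK it was built against (the exact output format varies by NumPy version):

```python
# Prints the BLAS/LAPACK configuration NumPy was built with, e.g.
# openblas or atlas sections with library paths.
import numpy as np

np.show_config()
```

On Ubuntu itself, `update-alternatives --display libblas.so.3` or running `ldd` on NumPy's compiled extension modules can show which shared library is actually being loaded at run time.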
9
votes
1 answer

strange results when benchmarking numpy with atlas and openblas

I am trying to evaluate the performance of numpy linked to ATLAS compared to numpy linked to OpenBLAS. I get some strange results for ATLAS, which I describe below. The Python code for evaluating matrix-matrix multiplication (aka sgemm) looks like…
rocksportrocker
  • 7,251
  • 2
  • 31
  • 48
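A minimal timing harness along the lines the question describes might look like this; float32 operands make NumPy dispatch to sgemm, and 2n³ is the usual flop-count convention for dense matrix multiplication:

```python
import time
import numpy as np

n = 500
A = np.random.rand(n, n).astype(np.float32)
B = np.random.rand(n, n).astype(np.float32)

np.dot(A, B)  # warm-up call so library init is not timed

reps = 10
t0 = time.perf_counter()
for _ in range(reps):
    C = np.dot(A, B)
elapsed = (time.perf_counter() - t0) / reps

gflops = 2.0 * n**3 / elapsed / 1e9
print(f"{elapsed * 1e3:.1f} ms per multiply, {gflops:.1f} GFLOP/s")
```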
9
votes
1 answer

Numpy.dot bug? Inconsistent NaN behavior

I noticed an inconsistent behavior in numpy.dot when nans and zeros are involved. Can anybody make sense of it? Is this a bug? Is this specific to the dot function? I'm using numpy v1.6.1, 64bit, running on linux (also tested on v1.6.2). I also…
shx2
  • 61,779
  • 13
  • 130
  • 153
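The behaviour the question describes can be reproduced with a small sketch. IEEE 754 defines 0 × NaN as NaN, so a dot product that touches a NaN should itself be NaN; whether a BLAS-backed matrix product preserves this can depend on the implementation, since some skip arithmetic for zero entries:

```python
import numpy as np

x = np.array([np.nan, 1.0])
z = np.array([0.0, 0.0])

# Elementwise, nan * 0 is nan per IEEE 754:
print(x * z)         # [nan  0.]

# The 1-D dot product sums those products, so the nan propagates:
print(np.dot(x, z))  # nan
```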
9
votes
3 answers

Armadillo (+BLAS) using GPU

Is it possible to run Armadillo's calculations using the GPU? Is there any way to use GPU BLAS libraries (for example cuBLAS) with Armadillo? Just a note, I am totally new to GPU programming.
Milan Domazet
  • 113
  • 1
  • 6
9
votes
5 answers

BLAS library incompatible with Fortran 77 compiler settings

I'm trying to install Octave-3.6.2 from source on Ubuntu 12.04 with the KDE desktop, but when I run the Octave configure script I get this error: "BLAS library was detected but found incompatible with your Fortran 77 compiler settings". I used ./configure…
babelproofreader
  • 530
  • 1
  • 6
  • 20
9
votes
0 answers

Efficient way of computing matrix product AXA'?

I'm currently using the BLAS function DSYMM to compute Y = AX and then DGEMM for YA', but I'm wondering: is there some more efficient way of computing the matrix product AXAᵀ, where A is an arbitrary n×n matrix and X is a symmetric n×n matrix?
Jouni Helske
  • 6,427
  • 29
  • 52
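A sketch of the two-call approach the question describes, using SciPy's BLAS wrappers: dsymm exploits the symmetry of X to form Y = AX, then dgemm forms YAᵀ. Exploiting the symmetry of the result itself (updating only one triangle of AXAᵀ) is not something plain dsymm/dgemm offer.

```python
import numpy as np
from scipy.linalg import blas

n = 4
A = np.random.rand(n, n)   # arbitrary n x n
X = np.random.rand(n, n)
X = (X + X.T) / 2.0        # symmetric n x n

# side=1 puts the symmetric operand on the right: Y = A @ X
Y = blas.dsymm(1.0, X, A, side=1)
# trans_b=1 transposes the second operand: R = Y @ A.T
R = blas.dgemm(1.0, Y, A, trans_b=1)

print(np.allclose(R, A @ X @ A.T))
```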
8
votes
3 answers

Armadillo + BLAS + LAPACK: Linking error?

When I try to compile example1.cpp that comes with Armadillo 2.4.2, I keep getting the following linking error: /tmp/ccbnLbA0.o: In function `double arma::blas::dot(unsigned int, double const*, double…
Marc
  • 532
  • 3
  • 5
  • 14
8
votes
4 answers

Initialize double array with nonzero values (BLAS)

I have allocated a big double vector, let's say with 100000 elements. At some point in my code, I want to set all elements to a constant, nonzero value. How can I do this without using a for loop over all elements? I am also using the BLAS package, if…
Günter
  • 81
  • 2
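The question is truncated, but in NumPy terms the usual answer is a vectorized fill rather than an element-by-element loop; the classic BLAS-level trick is dcopy with a zero source increment from a one-element array, though zero increments are not guaranteed to work in every BLAS implementation.

```python
import numpy as np

v = np.empty(100000)

# Vectorized constant fill: the loop runs in compiled code, not Python.
v.fill(3.14)   # equivalently: v[:] = 3.14

print(v[0], v[-1])   # 3.14 3.14
```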