Questions tagged [blas]

The Basic Linear Algebra Subprograms are a standard set of interfaces for low-level vector and matrix operations commonly used in scientific computing.

A reference implementation is available at Netlib; optimized implementations (for example ATLAS, OpenBLAS, and Intel MKL) are also available for most high-performance computing architectures.

The BLAS routines are divided into three levels (a minimal call from each level is sketched after the list):

  • Level 1: vector operations e.g. vector addition, dot product
  • Level 2: matrix-vector operations e.g. matrix-vector multiplication
  • Level 3: matrix-matrix operations e.g. matrix multiplication
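
As a rough illustration, one routine from each level can be called through SciPy's BLAS wrappers (scipy.linalg.blas); the routine names daxpy, dgemv, and dgemm are standard BLAS, while reaching them through SciPy is just one convenient option assumed for this sketch.

```
import numpy as np
from scipy.linalg import blas

A = np.random.rand(4, 3)
B = np.random.rand(3, 5)
x = np.random.rand(3)
y = np.random.rand(3)

# Level 1 (vector-vector): daxpy computes y := a*x + y
y = blas.daxpy(x, y, a=2.0)

# Level 2 (matrix-vector): dgemv computes y := alpha*A*x + beta*y
v = blas.dgemv(1.0, A, x)        # result has shape (4,)

# Level 3 (matrix-matrix): dgemm computes C := alpha*A*B + beta*C
C = blas.dgemm(1.0, A, B)        # result has shape (4, 5)
```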
906 questions
8 votes · 1 answer

Is GEMM or BLAS used in Tensorflow, Theano, Pytorch

I know that Caffe uses GEneral Matrix to Matrix Multiplication (GEMM), which is part of the Basic Linear Algebra Subprograms (BLAS) library, for performing convolution operations, where a convolution is converted to a matrix multiplication operation. I have…
Gaurav Srivastava
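
The im2col-plus-GEMM idea the question refers to can be sketched in plain NumPy; the function below is a hypothetical illustration (single channel, "valid" cross-correlation as deep-learning frameworks use it), not how Caffe or any particular framework actually implements it, and the final matrix product is exactly the GEMM that a BLAS library would perform.

```
import numpy as np

def conv2d_as_gemm(image, filters):
    """Single-channel 'valid' convolution expressed as one matrix product."""
    K, kh, kw = filters.shape
    H, W = image.shape
    oh, ow = H - kh + 1, W - kw + 1

    # im2col: each column holds one kh*kw patch of the image
    cols = np.empty((kh * kw, oh * ow))
    for i in range(oh):
        for j in range(ow):
            cols[:, i * ow + j] = image[i:i + kh, j:j + kw].ravel()

    # One GEMM computes all K feature maps at once
    out = filters.reshape(K, kh * kw) @ cols
    return out.reshape(K, oh, ow)

img = np.arange(25, dtype=float).reshape(5, 5)
flt = np.ones((2, 3, 3))
maps = conv2d_as_gemm(img, flt)
assert maps.shape == (2, 3, 3)
assert np.isclose(maps[0, 0, 0], img[0:3, 0:3].sum())  # all-ones filter sums the patch
```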
8 votes · 4 answers

LAPACK/BLAS versus simple "for" loops

I want to migrate a piece of code that involves a number of vector and matrix calculations to C or C++, the objective being to speed up the code as much as possible. Are linear algebra calculations with for loops in C code as fast as calculations…
behzad.nouri
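
The question is about C/C++, but the gap between hand-written loops and a BLAS-backed call can be sketched from Python, where a 2-D float `@` product dispatches to the linked BLAS GEMM. The naive loop below overstates the gap because it also pays interpreter overhead; even in compiled C, though, an optimized BLAS usually wins through cache blocking and SIMD. Sizes and timings are illustrative only.

```
import time
import numpy as np

n = 200
A = np.random.rand(n, n)
B = np.random.rand(n, n)

def naive_matmul(A, B):
    """Textbook triple loop, the 'simple for loops' approach."""
    C = np.zeros((A.shape[0], B.shape[1]))
    for i in range(A.shape[0]):
        for k in range(A.shape[1]):
            aik = A[i, k]
            for j in range(B.shape[1]):
                C[i, j] += aik * B[k, j]
    return C

t0 = time.perf_counter()
C1 = naive_matmul(A, B)
t1 = time.perf_counter()
C2 = A @ B                      # dispatches to the linked BLAS dgemm
t2 = time.perf_counter()

assert np.allclose(C1, C2)
print(f"naive loops: {t1 - t0:.3f}s, BLAS GEMM: {t2 - t1:.5f}s")
```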
8 votes · 3 answers

BLAS matrix by matrix transpose multiply

I have to calculate some products of the form A'A or, more generally, A'DA, where A is a general m×n matrix and D is a diagonal m×m matrix. Both of them are full rank, i.e. rank(A) = min(m,n). I know that you can save substantial time in such symmetric…
enanone
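
The symmetric saving the question alludes to is exactly what DSYRK provides: it computes A'A while touching only one triangle of the result, and A'DA reduces to the same routine by folding sqrt(D) into A (assuming D has nonnegative entries). A minimal sketch via SciPy's BLAS wrapper; in C or Fortran the call would be cblas_dsyrk/dsyrk directly.

```
import numpy as np
from scipy.linalg.blas import dsyrk

m, n = 6, 4
A = np.asfortranarray(np.random.rand(m, n))
d = np.random.rand(m)                    # diagonal of D, assumed nonnegative here

# A'A: dsyrk with trans=1 computes alpha * A^T A, filling only the upper triangle
C = dsyrk(1.0, A, trans=1)
C = np.triu(C) + np.triu(C, 1).T         # symmetrize for the comparison below
assert np.allclose(C, A.T @ A)

# A'DA: fold sqrt(D) into A, then reuse the same rank-k update (B'B = A'DA)
B = A * np.sqrt(d)[:, None]
G = dsyrk(1.0, B, trans=1)
G = np.triu(G) + np.triu(G, 1).T
assert np.allclose(G, A.T @ np.diag(d) @ A)
```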
8 votes · 0 answers

Do numpy or scipy implement sub-cubic multiplication

I've searched quite a bit, but I've only found homegrown reimplementations of Strassen matrix multiplication. Wikipedia says that numpy uses BLAS (which includes high-performance implementations of sub-cubic matrix multiplication algorithms, e.g.…
user
8 votes · 1 answer

How to perform Vector-Matrix Multiplication with BLAS?

BLAS defines the GEMV (Matrix-Vector Multiplication) level-2 operation. How do I use a BLAS library to perform Vector-Matrix Multiplication? It's probably obvious, but I don't see how to use a BLAS operation for this multiplication. I would have…
Baptiste Wicht
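
There is no separate routine for the vector-matrix case: GEMV with the transpose flag computes x'A (equivalently A'x), so the same level-2 call covers both orientations. A small sketch through SciPy's dgemv wrapper; with the C interface the corresponding argument would be CblasTrans.

```
import numpy as np
from scipy.linalg.blas import dgemv

A = np.random.rand(5, 3)
x = np.random.rand(5)            # length matches the number of rows of A

# Vector-matrix product x^T A == A^T x: GEMV with trans=1
y = dgemv(1.0, A, x, trans=1)    # result has shape (3,)
assert np.allclose(y, x @ A)
```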
8 votes · 1 answer

OpenBLAS routine used from R/Rcpp runs only on a single core in Linux

I am trying to run a QR decomposition (LAPACKE_dgeqrf) in R on a Linux machine (CentOS) using a C++ program that is interfaced with Rcpp. Unfortunately, I see only 100% CPU usage in top. This also happens on a Red Hat Enterprise Linux Server. However, the…
chris
8 votes · 1 answer

dgemm segfaulting with large F-order matrices in scipy

I'm attempting to compute A*A.T in Python using SciPy's dgemm, but I get a segfault when A has a large row dimension (~50,000) and I pass the matrices in F-order. Of course, the resulting matrix is very large, but both sgemm and passing to dgemm…
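
One way to form A·Aᵀ with SciPy's dgemm while staying in Fortran order (so the wrapper does not silently copy the operands) is to pass the same F-ordered array twice and set trans_b; the sizes below are small stand-ins for the ~50,000-row case in the question, and for this particular product dsyrk would also be a natural, slightly cheaper choice.

```
import numpy as np
from scipy.linalg.blas import dgemm

A = np.asfortranarray(np.random.rand(1000, 20))   # stand-in for the large matrix

# C = A * A.T without materializing the transpose: dgemm with trans_b=True.
# F-ordered float64 inputs can be used by dgemm directly, without copies.
C = dgemm(alpha=1.0, a=A, b=A, trans_b=True)
assert C.shape == (1000, 1000)
assert np.allclose(C, A @ A.T)
```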
8 votes · 2 answers

R loop getting slower and slower

I am struggling to understand why this bit of code (adapted from the R Benchmark 2.5) becomes slower and slower (on average) as the number of iterations increases. require(Matrix) c <- 0; for (i in 1:100) { a <- new("dgeMatrix", x = rnorm(3250 *…
RenéR
8 votes · 5 answers

Numpy and Scipy installation on Windows

I have installed Numpy successfully. But on the site, there are a lot of things that I have to do, such as building Numpy and Scipy and downloading ATLAS, LAPACK, etc. I am really confused, and even though I have checked some of the other queries, I am still not able…
Hemant
8 votes · 1 answer

Are BLAS Level 1 procedures still relevant for modern Fortran compilers?

Most of the BLAS Level 1 API can be written straightforwardly using Fortran 9x+ vectorized assignments and intrinsic procedures. Assuming you are using a modern optimizing compiler, like Intel Fortran, and the correct target-specific compiler…
abbot
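
The question is about Fortran, but the comparison it makes, a library Level-1 call versus the equivalent vectorized expression, can be sketched in Python for concreteness: both spellings below compute the same axpy update. Whether a compiler turns the vectorized form into code as fast as a tuned daxpy is exactly the open question and is not settled by this sketch.

```
import numpy as np
from scipy.linalg.blas import daxpy

a = 2.5
x = np.random.rand(1_000_000)
y = np.random.rand(1_000_000)

# Library route: BLAS Level 1 daxpy, y := a*x + y
y_blas = daxpy(x, y.copy(), a=a)

# Vectorized route: the expression a Fortran 90+ compiler would see directly
y_vec = a * x + y

assert np.allclose(y_blas, y_vec)
```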
8 votes · 2 answers

How to accelerate matrix multiplications in Python?

I am developing a small neural network whose parameters need a lot of optimization, and thus a lot of processing time. I have profiled my script with cProfile, and what takes 80% of the processor time is the NumPy dot function; the rest is matrix inversion…
PierreE
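
NumPy's dot/`@` already hands 2-D float products to the linked BLAS GEMM, so the usual gains come from making sure that path is taken: check which BLAS NumPy was built against, keep operands as contiguous float arrays, and batch many small products into one large one. A rough checklist in code; the actual speedups depend entirely on the BLAS in use.

```
import numpy as np

# 1. See which BLAS NumPy is linked against (OpenBLAS, MKL, reference BLAS, ...)
np.show_config()

# 2. Contiguous float operands let dot/@ dispatch straight to GEMM
A = np.ascontiguousarray(np.random.rand(500, 500))
B = np.ascontiguousarray(np.random.rand(500, 500))
C = A @ B                        # one GEMM call

# 3. Batch work: one large product instead of many small ones
xs = np.random.rand(1000, 64)    # 1000 input vectors stacked as rows
W = np.random.rand(64, 32)
out = xs @ W                     # one GEMM instead of 1000 separate GEMVs
```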
7 votes · 1 answer

Difference between dtrtrs and dtrsm

I am looking for some triangular solvers, and I have come across two: one in BLAS, dtrsm, and another in LAPACK, dtrtrs. From the looks of it both seem to have common functionality, with dtrsm having a little bit more functionality (scaling…
Pavan Yalamanchili
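
Both routines solve triangular systems: dtrsm (BLAS Level 3) also folds a scaling factor alpha into the solve and lets the triangular matrix sit on either side, while dtrtrs (LAPACK) returns an info code and rejects exactly singular triangles. A minimal side-by-side through SciPy's wrappers, purely to show they agree on a well-conditioned system.

```
import numpy as np
from scipy.linalg.blas import dtrsm
from scipy.linalg.lapack import dtrtrs

n = 5
L = np.tril(np.random.rand(n, n)) + n * np.eye(n)   # well-conditioned lower triangle
B = np.random.rand(n, 3)

# BLAS: solve L X = alpha * B, with alpha handled inside the routine
X_blas = dtrsm(1.0, L, B, lower=1)

# LAPACK: solve L X = B, with an info flag reporting exact singularity
X_lapack, info = dtrtrs(L, B, lower=1)

assert info == 0
assert np.allclose(X_blas, X_lapack)
assert np.allclose(L @ X_lapack, B)
```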
7 votes · 1 answer

In R, how to control multi-threading in the BLAS parallel matrix product

I have a question regarding the use of the BLAS-parallelized matrix product in R (the default matrix product at least since R 3.4, maybe earlier). The default behavior (at least on my machine) is now for the matrix product (cf. example below) to…
Odin
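
For OpenBLAS the thread count is usually taken from the OPENBLAS_NUM_THREADS environment variable (OMP_NUM_THREADS for OpenMP builds, MKL_NUM_THREADS for MKL), and that applies whichever language loads the library, R included; setting the variable before R starts has the same effect as the Python sketch below, which assumes an OpenBLAS-backed NumPy purely for illustration.

```
import os

# Must be set before the BLAS library is loaded, i.e. before importing numpy
os.environ["OPENBLAS_NUM_THREADS"] = "1"   # assumption: OpenBLAS-backed NumPy
os.environ["OMP_NUM_THREADS"] = "1"        # covers OpenMP-threaded builds

import numpy as np

A = np.random.rand(2000, 2000)
B = np.random.rand(2000, 2000)
C = A @ B   # this GEMM should now run on a single thread
```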
7 votes · 3 answers

How can I make use of intel-mkl with tensorflow

I've seen a lot of documentation about making use of a CPU with tensorflow; however, I don't have a GPU. What I do have is a fairly capable CPU and 5 GB of the Intel Math Kernel Library, which, I hope, might help me speed up tensorflow a fair…
George H
7 votes · 1 answer

Spark netlib-java BLAS

I am trying to troubleshoot my non-working Apache Spark and netlib setup, and I don't know what to do next. Here is some info: Spark 1.3.1 (but also tried 1.5.1), a Mesos cluster with 3 nodes, Ubuntu Trusty on every node, and installed the following BLAS…
wobu