Questions tagged [intel-mkl]

Intel MKL (Math Kernel Library) is a high-performance math library optimised for Intel processors. It is explicitly parallelised, and a separate version is available for high-end supercomputer clusters. Its core functions include BLAS and LAPACK linear algebra routines, fast Fourier transforms and vector math functions, among others.

Intel MKL supports only Intel and compatible processors. It is available for Windows, Linux and OS X as part of Intel® Parallel Studio and Intel® System Studio, and free versions are available for students and academic researchers at qualifying institutions.

The Intel® Math Kernel Library includes the following groups of routines:

  • Basic Linear Algebra Subprograms (BLAS):
    • vector operations
    • matrix-vector operations
    • matrix-matrix operations
  • Sparse BLAS Level 1, 2, and 3 (basic operations on sparse vectors and matrices)
  • LAPACK routines for solving systems of linear equations
  • LAPACK routines for solving least squares problems, eigenvalue and singular value problems, and Sylvester's equations
  • Auxiliary, utility, and test LAPACK routines
  • ScaLAPACK computational, driver and auxiliary routines (only in Intel MKL for Linux and Windows operating systems)
  • PBLAS routines for distributed vector, matrix-vector, and matrix-matrix operations
  • Direct and Iterative Sparse Solver routines, including a solver based on the PARDISO sparse solver and the Intel MKL Parallel Direct Sparse Solver for Clusters
  • Direct Sparse Solver (DSS)
  • Extended Eigensolver routines for solving symmetric standard or generalized symmetric definite eigenvalue problems using the FEAST algorithm
  • Vector Mathematical Library (VML) functions for computing core mathematical functions on vector arguments (with Fortran and C interfaces)
  • Vector Statistical Library (VSL) functions for generating vectors of pseudorandom numbers with different types of statistical distributions and for performing convolution and correlation computations
  • General Fast Fourier Transform (FFT) functions, which compute the Discrete Fourier Transform via FFT algorithms and have Fortran and C interfaces
  • Cluster FFT functions (only in Intel MKL for Linux and Windows operating systems)
  • Tools for solving partial differential equations - trigonometric transform routines and Poisson solver
  • Optimization Solver routines for solving nonlinear least squares problems through the Trust-Region (TR) algorithms and computing the Jacobian matrix by central differences
  • Basic Linear Algebra Communication Subprograms (BLACS) that are used to support a linear algebra oriented message passing interface
  • Data Fitting functions for spline-based approximation of functions, derivatives and integrals of functions, and search
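
As a rough illustration of the BLAS and LAPACK groups listed above, here is a minimal sketch in Python using NumPy, which many of the questions under this tag rely on. It assumes only that NumPy is installed. Whether these calls actually reach MKL depends on how your NumPy was built, and np.show_config() prints the build's BLAS/LAPACK information so you can check. The variable names are illustrative and are not part of MKL's own API.

```python
import numpy as np

# Print the BLAS/LAPACK libraries this NumPy build was linked against.
# An MKL-backed build typically mentions "mkl" here; other builds may
# show OpenBLAS or a reference implementation instead.
np.show_config()

rng = np.random.default_rng(0)
x = rng.standard_normal(4)
y = rng.standard_normal(4)
# Adding a multiple of the identity keeps A comfortably invertible.
A = rng.standard_normal((4, 4)) + 4.0 * np.eye(4)
B = rng.standard_normal((4, 4))

# BLAS Level 1 style: vector-vector operation (dot product).
level1 = x @ y

# BLAS Level 2 style: matrix-vector operation.
level2 = A @ x

# BLAS Level 3 style: matrix-matrix operation.
level3 = A @ B

# LAPACK style: solve the linear system A z = b.
# NumPy documents that np.linalg.solve uses a LAPACK gesv routine.
b = A @ x
x_solved = np.linalg.solve(A, b)

print(level1, level2.shape, level3.shape, np.allclose(x_solved, x))
```

The same script runs unchanged on an MKL-backed or a non-MKL NumPy build, which makes it a convenient starting point for the "with MKL versus without MKL" comparisons that several questions below ask about.
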
798 questions
7
votes
6 answers

High-performance Math library for .NET /C# and Java

We currently have a high-performance scientific application written in C++ that makes use of Intel Math Kernel Library. We are considering writing a benchmark application written in Java and .NET/C# to compare the performance difference. To do that,…
sivabudh
  • 31,807
  • 63
  • 162
  • 228
7
votes
1 answer

How to avoid this four-line memory leak with NumPy+MKL?

The following simple four-line code produces a memory leak in my Python 2.6.6 / NumPy 1.7.0 / MKL 10.3.6 setup: import numpy as np t = np.random.rand(10,10) while True: t = t / np.trace(t) With each operation, the used memory grows by the size…
cm_
  • 153
  • 6
7
votes
1 answer

Supposed automatically threaded scipy and numpy functions aren't making use of multiple cores

I am running Mac OS X 10.6.8 and am using the Enthought Python Distribution. I want for numpy functions to take advantage of both my cores. I am having a problem similar to that of this post: multithreaded blas in python/numpy but after following…
Nino
  • 411
  • 4
  • 15
6
votes
1 answer

How to enable and disable Intel MKL in numpy Python?

I want to test and compare Numpy matrix multiplication and Eigen decomposition performance with Intel MKL and without Intel MKL. I have installed MKL using pip install mkl (Windows 10 (64-bit), Python 3.8). I then used examples from here for matmul…
user40
  • 1,361
  • 5
  • 19
  • 34
6
votes
1 answer

Intel MKL error using Conda and matplotlib: "Library not loaded: @rpath/libiomp5.dylib" on macOS

I'm using a conda environment for a project and when I install matplotlib I get the following error when attempting to run python: (conda environment path)/bin/python (Project path)/src/__init__.py INTEL MKL ERROR: dlopen((conda environment…
6
votes
1 answer

Conda install r-essentials with MKL

On my RHEL-server I do not have admin rights, but I can create Conda environments. I would like to create a Conda environment running R with Intel MKL (Intel® Math Kernel Library). I create the environment with R_defaults.yml, running $> conda env…
Geir Inge
  • 179
  • 3
  • 10
6
votes
0 answers

Getting max FLOPS for dense matrix multiplication with the Xeon Phi Knights Landing

I recently started working with a Xeon Phi Knights Landing (KNL) 7250 computer (http://ark.intel.com/products/94035/Intel-Xeon-Phi-Processor-7250-16GB-1_40-GHz-68-core). This has 68 cores and AVX 512. The base frequency is 1.4 GHz and the Turbo…
Z boson
  • 32,619
  • 11
  • 123
  • 226
6
votes
1 answer

Intel MKL Error with Gaussian Fitting in Python?

I'm doing a Monte Carlo simulation in Python in which I obtain a set of intensities at certain 2D coordinates and then fit a 2D Gaussian to them. I'm using the scipy.optimize.leastsq function and it all seems to work well except for the following…
blah1234
  • 131
  • 1
  • 1
  • 4
6
votes
1 answer

passing a noncontiguous array section in Fortran

I am using intel fortran compiler and intel mkl for a performance check. I am passing some array sections to Fortran 77 interface with calls like call dgemm( transa,transb,sz_s,P,P,& a, Ts_tilde,& …
Umut Tabak
  • 1,862
  • 4
  • 26
  • 41
6
votes
1 answer

Numpy np.einsum array multiplication using multiple cores

I have compiled numpy 1.6.2 and scipy with MKL hoping to have a better performance. Currently I have a code that relies heavily on np.einsum(), and I was told that einsum is not good with MKL, because there is almost no vectorization. =( So I was…
tcapelle
  • 442
  • 3
  • 12
6
votes
2 answers

MKL Performance on Intel Phi

I have a routine that performs a few MKL calls on small matrices (50-100 x 1000 elements) to fit a model, which I then call for different models. In pseudo-code: double doModelFit(int model, ...) { ... while( !done ) { cblas_dgemm(...); …
Andrew
  • 867
  • 7
  • 20
6
votes
3 answers

Numpy-MKL for OS X

I love being able to use Christoph Gohlke's numpy-MKL version of NumPy linked to Intel's Math Kernel Library on Windows. However, I have been unable to find a similar version for OS X, preferably NumPy 1.7 linked for Python 3.3 on Mountain Lion.…
MattDMo
  • 100,794
  • 21
  • 241
  • 231
5
votes
1 answer

Unable to install Scipy with MKL using Meson

I am attempting to install scipy 1.9.1 with the MKL implementation of BLAS, using pip as my package manager. For numpy, I can do this with: pip install numpy --no-binary numpy. Doing this with Scipy (pip install scipy --no-binary scipy) fails with…
Evan Gaertner
  • 51
  • 1
  • 3
5
votes
2 answers

conda env broken after installing pytorch on M1 - Intel MKL FATAL ERROR

I installed pytorch on my M1 MacBook, following some instructions online (via conda command). Then my whole environment got corrupted. Whenever I try to import some library (pandas, numpy, whatever) I get this: Intel MKL FATAL ERROR: This system…
5
votes
0 answers

How can I have a finer control on number of threads used for each BLAS kernel call on CPU?

I am writing an OpenMP code calling different BLAS kernels, mostly DGEMMs with different sizes, in different threads. To maximize performance I want to have control over the number of threads I am calling for each BLAS. It seems that it is a very…
Aznaveh
  • 558
  • 8
  • 27