
I am exploring the scipy.linalg.get_blas_funcs() function in Python, but I am noticing some difficulties it has in handling NumPy integer types.

Input

import scipy.linalg
import numpy as np

def blas(name, ndarray):
    # look up the BLAS routine matching the array's dtype
    func = scipy.linalg.get_blas_funcs((name,), (ndarray,), dtype=ndarray.dtype)[0]
    return func

blas_scal = blas('scal', np.array([], dtype=np.int32))
print("int32 --> %s" % blas_scal.dtype)

blas_scal = blas('scal', np.array([], dtype=np.int64))
print("int64 --> %s" % blas_scal.dtype)

blas_scal = blas('scal', np.array([], dtype=np.float32))
print("float32 --> %s" % blas_scal.dtype)

blas_scal = blas('scal', np.array([], dtype=np.float64))
print("float64 --> %s" % blas_scal.dtype)

Output

int32 --> float64
int64 --> float64
float32 --> float32
float64 --> float64

As you can see, get_blas_funcs() seems to ignore the data type when it is an integer: any NumPy integer input yields a float64 routine. A NumPy float32 or float64 input, however, passes through get_blas_funcs() with its dtype unchanged.
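
Incidentally, the returned function objects also carry a typecode attribute (the single-letter BLAS prefix), which tells the same story. A minimal check, assuming I am reading those attributes correctly:

import scipy.linalg
import numpy as np

# inspect the BLAS prefix chosen for each input dtype
for dt in (np.int32, np.int64, np.float32, np.float64):
    func = scipy.linalg.get_blas_funcs(('scal',), (np.array([], dtype=dt),))[0]
    print("%s --> %s (prefix '%s')" % (np.dtype(dt).name, func.dtype, func.typecode))

Both integer dtypes come back with prefix 'd' (double precision), while float32 and float64 map to 's' and 'd' respectively.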

What is happening here? How does this function handle data types?

quanty

1 Answer


Some general comments:

  • BLAS is more a specification than a library
    • There are many implementations (Atlas, OpenBLAS, MKL), all more or less working on the same interface
  • BLAS is all about floating-point math
    • The available prefixes determine the type in use (see top left)
    • There are four: S: single-precision, D: double-precision, C: complex single-precision and Z: complex double-precision
      • (sometimes mixed types are supported, like: CS, ZD)
        • For example, the function scasum takes a complex input array and returns a real value (link above)
    • In general there is no support for integer types!
      • There seem to be some ideas to support those (links: 1, 2), but it surely is not common (and maybe there is no robust / well-tested support yet)
      • I'm pretty sure there is not a single integer-like BLAS usage in any of those big libs (numpy, scipy, sklearn, pandas), which of course would be hard without a standard supported by all possible BLAS backends (as I said: multiple candidates!)
  • This implies: you are asking for a function which is not available (scal with integer-like types)
    • This means: casting needs to be done!
    • What kind of casting is done is a design decision:
      • Here it seems (probably easy to check the code if needed):
        • If the type is one of the supported ones, keep it
          • float32 -> float32 (S -> S)
          • float64 -> float64 (D -> D)
        • If not: cast to double-precision / D (more or less the default in numpy/scipy; more common than single-precision, at least nowadays); see the sketch after this list
  • Using BLAS is very low-level, and anyone doing it needs to understand what it takes to do it right (not only types, but also memory layout)!
    • Compare with scipy's warning: "These functions do little to no error checking. It is possible to cause crashes by mis-using them, so prefer using the higher-level routines in scipy.linalg."
  • Somewhat related question here (showing the implications of these missing features): Why is it faster to perform float by float matrix multiplication compared to int by int?
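
If you want the cast to be explicit in your own code instead of relying on whatever get_blas_funcs() decides, you can up-cast the integer data yourself and stay in double precision. A minimal sketch (the data is just for illustration):

import scipy.linalg
import numpy as np

x = np.arange(5, dtype=np.int64)                        # integer data
scal = scipy.linalg.get_blas_funcs(('scal',), (x,))[0]  # resolves to dscal
xf = x.astype(np.float64)                               # explicit cast to a supported type
y = scal(2.0, xf)                                       # y = 2.0 * xf, computed by dscal
print(y)                                                # [0. 2. 4. 6. 8.]

Note that scal may operate in place on its float64 argument; since xf is a fresh copy from astype, the original integer array stays untouched.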
sascha
  • Thanks, excellent answer. I've not done 'casting' before, would you mind providing some links to useful educational pages for it? – quanty Jan 15 '18 at 08:26
  • Ah, casting is just conversion between different data types. – quanty Jan 15 '18 at 09:38
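
To illustrate that last comment: in NumPy an explicit cast is just astype(), which by default returns a converted copy and leaves the original array unchanged. A minimal example:

import numpy as np

a = np.array([1, 2, 3], dtype=np.int32)
b = a.astype(np.float64)   # explicit cast: creates a float64 copy
print(a.dtype, b.dtype)    # int32 float64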