44

I'd like to calculate the mathematical rank of a matrix using scipy. The most obvious function, numpy.rank, calculates the number of dimensions of an array (i.e. scalars have dimension 0, vectors 1, matrices 2, etc.). I am aware that the numpy.linalg.lstsq function has this capability, but I was wondering if such a fundamental operation is built into the matrix class somewhere.

Here is an explicit example:

from numpy import matrix, rank  # numpy.rank has since been removed from NumPy; numpy.ndim is the equivalent
A = matrix([[1,3,7],[2,8,3],[7,8,1]])
print(rank(A))

This gives 2, the number of dimensions, whereas I'm looking for an answer of 3.

Hooked
  • I checked the rank using Mathematica - it's indeed 3. The function you're calling in Python is either incorrect or you're using it wrong. – duffymo Mar 18 '10 at 23:37
  • 3
    The usage is correct - this is what baffled me in the first place. In the post I explain what rank does: it calculates the dimensionality of the array. A "rank-3" array would be a list-of-lists-of-lists. – Hooked Mar 18 '10 at 23:44
  • Note that the term "rank" is somewhat ambiguous. For a tensor, the rank tells you the number of indices (e.g. a scalar is a rank-0 tensor, a vector rank-1 and a matrix rank-2). For linear algebra there is also the definition you cite above. From the docstring, it's clear that Numpy uses the former. – Rupert Nash Mar 29 '10 at 13:17

7 Answers

62

Numpy provides numpy.linalg.matrix_rank():

>>> import numpy
>>> numpy.__version__
'1.5.1'
>>> A = numpy.matrix([[1,3,7],[2,8,3],[7,8,1]])
>>> numpy.linalg.matrix_rank(A)
3
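As a quick check on a rank-deficient input, here is a minimal sketch. The matrix B is a hypothetical variation of the question's example, with the second row set to twice the first. matrix_rank counts the singular values above a tolerance, which can also be set explicitly through its tol argument.

```python
import numpy as np

A = np.array([[1, 3, 7], [2, 8, 3], [7, 8, 1]])
B = np.array([[1, 3, 7], [2, 6, 14], [7, 8, 1]])  # hypothetical: row 2 = 2 * row 1

rank_A = np.linalg.matrix_rank(A)  # full rank
rank_B = np.linalg.matrix_rank(B)  # one dependent row
print(rank_A, rank_B)  # 3 2
```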
Simon
  • 5
    How can I find the rank for **integer** matrices **modulo n**? In Mathematica there is this function MatrixRank[..., Modulus -> n], but how can I realize this function in Python? – Everett You Jun 18 '15 at 08:09
16

To provide a rough code snippet for people who need to get this done in practice. Feel free to improve.

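import numpy as np
# A is the matrix (or 2-D array) whose rank is wanted
s = np.linalg.svd(A, compute_uv=False)  # singular values only
rank = int(np.sum(s > 1e-10))  # count those above an absolute threshold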
Stefan van der Walt
8

If numpy does not offer a rank facility, why don't you write your own?

An efficient way to compute the rank is via the Singular Value Decomposition - the rank of the matrix is equal to the number of non-zero singular values.

import numpy

def rank(A, eps=1e-12):
    # Count the singular values that exceed the absolute threshold eps
    u, s, vh = numpy.linalg.svd(A)
    return len([x for x in s if abs(x) > eps])

Notice that eps depends on your application: most would agree that 1e-12 corresponds to zero, but you may witness numerical instability even for eps=1e-9.

Using your example, the answer is three. If you change the second row to [2, 6, 14] (linearly dependent with row one), the answer is two (the "zero" singular value is 4.9960E-16).
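One way to make the threshold less application-dependent is to scale it by the largest singular value. The sketch below is an illustration, not part of the original answer: the helper name rank_relative and its rtol parameter are hypothetical, and the default mirrors the commonly used choice of machine epsilon times the larger matrix dimension. It shows an absolute eps misjudging a matrix whose entries are all tiny.

```python
import numpy as np

def rank_relative(A, rtol=None):
    """Rank via SVD, with a tolerance relative to the largest singular value."""
    A = np.asarray(A, dtype=float)
    s = np.linalg.svd(A, compute_uv=False)
    if s.size == 0 or s.max() == 0:
        return 0
    if rtol is None:
        # Assumed default: machine epsilon scaled by the larger dimension
        rtol = max(A.shape) * np.finfo(A.dtype).eps
    return int(np.sum(s > rtol * s.max()))

A = np.array([[1, 3, 7], [2, 8, 3], [7, 8, 1]], dtype=float)
tiny = 1e-15 * A  # same rank as A, but every entry is far below 1e-12

# An absolute threshold of 1e-12 discards every singular value of `tiny`
absolute = int(np.sum(np.linalg.svd(tiny, compute_uv=False) > 1e-12))
relative = rank_relative(tiny)
print(absolute, relative)  # 0 3
```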

Praveen
Escualo
3

This answer is out of date.

The answer is no—there is currently no function dedicated to calculating the matrix rank of an array/matrix in scipy. Adding one has been discussed before, but if it's going to happen, I don't believe it has yet.

Mike Graham
  • 1
    Incidentally, http://mail.scipy.org/pipermail/numpy-discussion/2008-February/031214.html is the first post of a short newsgroup discussion about this. – Mike Graham Mar 19 '10 at 00:24
  • 5
    Nowadays, there is `numpy.linalg.matrix_rank()`. See my answer. – Simon Dec 02 '11 at 05:17
1

I don't know about Numpy in particular, but that's unlikely to be a built-in operation on a matrix; it involves fairly intensive numerical computations (and associated concerns about floating-point roundoff error and so forth) and threshold selections that may or may not be appropriate in a given context, and algorithm selection is important to computing it accurately and quickly.

Things that are built into the basic classes tend to be things that can be performed in a unique and straightforward manner, such as matrix multiplications at the most complex.

Brooks Moses
  • This is a good point: a numerically unstable matrix could cause the rank to change due to roundoff errors. However, this is a known problem and I was wondering if the scipy/numpy libraries directly have a function. If the answer is no, that's fine too; I can always go with an SVD. – Hooked Mar 18 '10 at 23:52
  • 1
    It's not just numerically-unstable ones. How about {{1.0, 3.0}, {1.0/3.0, 1.0}}? The division can't produce an exact answer, so should this get counted as rank 1, or rank 2? – Brooks Moses Mar 19 '10 at 18:17
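To make the 2x2 example from the comment above concrete, here is a small sketch that inspects its singular values in floating point. It assumes numpy.linalg.matrix_rank with its default tolerance is available; whether rank 1 is the "right" answer still depends on the application.

```python
import numpy as np

B = np.array([[1.0, 3.0], [1.0 / 3.0, 1.0]])

s = np.linalg.svd(B, compute_uv=False)
print(s)  # the second value is at round-off level relative to the first

# The default tolerance of matrix_rank treats that value as zero
rank_B = np.linalg.matrix_rank(B)
print(rank_B)
```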
1

The linear algebra functions are generally grouped in numpy.linalg. (They're also available from scipy.linalg, which has more functionality.) This allows polymorphism: the functions can accept any of the types that SciPy handles.

So, yes, the numpy.linalg.lstsq function does what you're asking. Why is that insufficient?
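For reference, a minimal sketch of reading the rank from lstsq. The right-hand side b is a placeholder introduced here only because lstsq requires one; the rank is the third element of the returned tuple.

```python
import numpy as np

A = np.array([[1, 3, 7], [2, 8, 3], [7, 8, 1]], dtype=float)
b = np.zeros(A.shape[0])  # placeholder right-hand side; only the rank is of interest

# lstsq returns (solution, residuals, rank, singular_values)
_, _, rank, _ = np.linalg.lstsq(A, b, rcond=None)
print(rank)  # 3
```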

bignose
  • 2
    It does what I'm asking, but it does a lot more unnecessarily, and with a large amount of baggage. The same could have been accomplished with an LU decomposition and then a row sort. The intent of the question, if this wasn't clear, was whether a function exists whose sole purpose is to calculate the rank, i.e. take in a matrix, spit out an int. – Hooked Mar 18 '10 at 23:48
1

scipy now contains an efficient interpolative method for estimating the rank of a matrix/LinearOperator using random methods, which can often be accurate enough:

>>> from numpy import matrix
>>> A = matrix([[1,3,7],[2,8,3],[7,8,1]], dtype=float)  # doesn't accept int

>>> import scipy.linalg.interpolative as sli
>>> sli.estimate_rank(A, eps=1e-10)
3
jawknee