
At least this is how it appears. The following code behaved correctly for 3x3 and 6x6 matrices.

    deter = mat.det(method='berkowitz')
    #self.resultFileHandler.writeLogStr(str(deter))
    a = sy_polys.Poly(deter, k)

For 3x3 it takes ~0.8s to execute this code; for 6x6 it takes ~288s (with only 650ms of that in the det function, the rest in the Poly call). For 10x10, either the complexity has ramped up at a colossal rate or something else is preventing the Poly call from returning (I waited a week). No exceptions are thrown.

The elements of the matrices whose determinants I take consist of large symbolic polynomials.
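
A stripped-down version of what I am timing, with small made-up quadratic elements standing in for the real ones (the real mat and k are built elsewhere):

    import time
    import sympy
    from sympy.polys import Poly

    k = sympy.symbols('k')
    # made-up 3x3 matrix; the real elements are much larger polynomials in k
    mat = sympy.Matrix(3, 3, lambda i, j: (i + 1)*k**2 + (j + 1)*k - (i + 1))

    t0 = time.time()
    deter = mat.det(method='berkowitz')
    t1 = time.time()
    a = Poly(deter, k)
    t2 = time.time()
    print("det: %.3fs, Poly: %.3fs" % (t1 - t0, t2 - t1))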

I was on SymPy 0.7.1 and have just upgraded to 1.0 (the problem occurs in both versions).

I added the logging to try to get the determinant written to file, but it sticks again in the str(deter) call. If I break, my debugger can't display deter (probably too large for the debugger).

Here is a stack:

MainThread - pid_135368_id_42197520 
  _print [printer.py:262]   
  _print_Add [str.py:56]    
  _print [printer.py:257]   
  parenthesize [str.py:29]  
  _print_Mul [str.py:290]   
  _print [printer.py:257]   
  _print_Add [str.py:56]    
  _print [printer.py:257]   
  parenthesize [str.py:29]  
  _print_Mul [str.py:290]   
  _print [printer.py:257]   
  _print_Add [str.py:56]    
  _print [printer.py:257]   
  parenthesize [str.py:29]  
  _print_Mul [str.py:290]   
  _print [printer.py:257]   
  _print_Add [str.py:56]    
  _print [printer.py:257]   
  parenthesize [str.py:29]  
  _print_Mul [str.py:290]   
  _print [printer.py:257]   
  _print_Add [str.py:56]    
  _print [printer.py:257]   
  parenthesize [str.py:29]  
  _print_Mul [str.py:290]   
  _print [printer.py:257]   
  _print_Add [str.py:56]    
  _print [printer.py:257]   
  parenthesize [str.py:29]  
  _print_Mul [str.py:290]   
  _print [printer.py:257]   
  _print_Add [str.py:56]    
  _print [printer.py:257]   
  parenthesize [str.py:29]  
  _print_Mul [str.py:290]   
  _print [printer.py:257]   
  _print_Add [str.py:56]    
  _print [printer.py:257]   
  parenthesize [str.py:29]  
  _print_Mul [str.py:290]   
  _print [printer.py:257]   
  _print_Add [str.py:56]    
  _print [printer.py:257]   
  parenthesize [str.py:29]  
  _print_Mul [str.py:290]   
  _print [printer.py:257]   
  _print_Add [str.py:56]    
  _print [printer.py:257]   
  parenthesize [str.py:29]  
  _print_Mul [str.py:290]   
  _print [printer.py:257]   
  _print_Add [str.py:56]    
  _print [printer.py:257]   
  doprint [printer.py:233]  
  sstr [str.py:748] 
  __str__ [basic.py:396]    
  _getRoots_sympy_Poly_nroots [__init__.py:91]  
  getRoots [__init__.py:68] 
  findPolyRoots [__init__.py:697]   
  _getNroots [polefinder.py:97] 
  _doForN [polefinder.py:60]    
  _incN [polefinder.py:52]  
  __init__ [polefinder.py:39]   
  _doPoleFind [polefinderwrap.py:33]    
  _polesForPos [polefinderwrap.py:47]   
  <module> [polefinderwrap.py:60]   
  run [pydevd.py:937]   
  <module> [pydevd.py:1530] 

OK, I've got an exception from the str function. Seems likely that the polynomial has become too large.

Traceback (most recent call last):
  File "E:\Peter's Documents\PhD\Code\Git\ProtoQScat\multichannel\qscat\ratsmat.\polefinder.py", line 60, in _doForN
    roots = self._getNroots(N)
  File "E:\Peter's Documents\PhD\Code\Git\ProtoQScat\multichannel\qscat\ratsmat.\polefinder.py", line 97, in _getNroots
    roots = ratSmat.findPolyRoots(False)
  File "E:\Peter's Documents\PhD\Code\Git\ProtoQScat\multichannel\qscat\numerical/..\ratsmat\__init__.py", line 697, in findPolyRoots
    roots = self.polyRootSolve.getRoots(mat, k)
  File "E:\Peter's Documents\PhD\Code\Git\ProtoQScat\multichannel\qscat\numerical/..\ratsmat\__init__.py", line 68, in getRoots
    ret = self._getRoots_sympy_Poly_nroots(mat, k)
  File "E:\Peter's Documents\PhD\Code\Git\ProtoQScat\multichannel\qscat\numerical/..\ratsmat\__init__.py", line 91, in _getRoots_sympy_Poly_nroots
    self.resultFileHandler.writeLogStr(str(deter))
  File "C:\Python27\lib\site-packages\sympy\core\basic.py", line 396, in __str__
    return sstr(self, order=None)
  File "C:\Python27\lib\site-packages\sympy\printing\str.py", line 748, in sstr
    s = p.doprint(expr)
  File "C:\Python27\lib\site-packages\sympy\printing\printer.py", line 233, in doprint
    return self._str(self._print(expr))
  File "C:\Python27\lib\site-packages\sympy\printing\printer.py", line 257, in _print
    return getattr(self, printmethod)(expr, *args, **kwargs)
  File "C:\Python27\lib\site-packages\sympy\printing\str.py", line 56, in _print_Add
    t = self._print(term)
  File "C:\Python27\lib\site-packages\sympy\printing\printer.py", line 257, in _print
    return getattr(self, printmethod)(expr, *args, **kwargs)
  File "C:\Python27\lib\site-packages\sympy\printing\str.py", line 290, in _print_Mul
    a_str = [self.parenthesize(x, prec) for x in a]
  File "C:\Python27\lib\site-packages\sympy\printing\str.py", line 29, in parenthesize
    return "(%s)" % self._print(item)
  File "C:\Python27\lib\site-packages\sympy\printing\printer.py", line 257, in _print
    return getattr(self, printmethod)(expr, *args, **kwargs)
  File "C:\Python27\lib\site-packages\sympy\printing\str.py", line 56, in _print_Add
    t = self._print(term)
  File "C:\Python27\lib\site-packages\sympy\printing\printer.py", line 257, in _print
    return getattr(self, printmethod)(expr, *args, **kwargs)
  File "C:\Python27\lib\site-packages\sympy\printing\str.py", line 290, in _print_Mul
    a_str = [self.parenthesize(x, prec) for x in a]
  File "C:\Python27\lib\site-packages\sympy\printing\str.py", line 29, in parenthesize
    return "(%s)" % self._print(item)
  File "C:\Python27\lib\site-packages\sympy\printing\printer.py", line 257, in _print
    return getattr(self, printmethod)(expr, *args, **kwargs)
  File "C:\Python27\lib\site-packages\sympy\printing\str.py", line 56, in _print_Add
    t = self._print(term)
  File "C:\Python27\lib\site-packages\sympy\printing\printer.py", line 257, in _print
    return getattr(self, printmethod)(expr, *args, **kwargs)
  File "C:\Python27\lib\site-packages\sympy\printing\str.py", line 290, in _print_Mul
    a_str = [self.parenthesize(x, prec) for x in a]
  File "C:\Python27\lib\site-packages\sympy\printing\str.py", line 29, in parenthesize
    return "(%s)" % self._print(item)
  File "C:\Python27\lib\site-packages\sympy\printing\printer.py", line 257, in _print
    return getattr(self, printmethod)(expr, *args, **kwargs)
  File "C:\Python27\lib\site-packages\sympy\printing\str.py", line 69, in _print_Add
    return sign + ' '.join(l)
MemoryError

EDIT: Following on from the answer below, here is a profile plot against the determinant size (channel). Ignore N (on the y axis); it is another parameter of the calculation (it governs the size of the polynomials in the elements). [profile plot image]

Jibbity jobby
  • I don't know exactly how sympy is doing it under the hood, but note that determinants are n! complexity going straight from the definition (and if it's trying to get perfect precision, maybe that's what it's doing). With that complexity, a week is very plausible – en_Knight Jun 30 '16 at 14:34
  • The thing is though, it's not sticking in the det call, rather the str or Poly call after. – Jibbity jobby Jun 30 '16 at 14:36
  • What is the variable k? – en_Knight Jun 30 '16 at 14:38
  • It's the independent variable of the polynomials in the elements. The element polynomials consist of a single variable, k. – Jibbity jobby Jun 30 '16 at 14:47
  • I'm starting to think this must be the n! complexity. Strange that it doesn't stick on the det. The memory error though indicates that an absolutely massive polynomial must be the calculated result. – Jibbity jobby Jun 30 '16 at 20:14
  • Not that strange. I bet it lazily evaluates for as long as possible - if it defers the final computation to the end, it can make simplifications along the way. Trivial example: (X - X) == 0 in O(1), no matter how complex or slow it would be to evaluate X. That's what symbolic computation is good for, right :)? – en_Knight Jun 30 '16 at 20:17
  • Related question: https://stackoverflow.com/questions/37026935/speeding-up-computation-of-symbolic-determinant-in-sympy – asmeurer Jul 01 '16 at 16:19
  • That's useful but I should add that my problem was never with det (which seems relatively fast and appears to scale nearly linearly) but whenever I use the returned result from det. – Jibbity jobby Jul 01 '16 at 19:35
  • In that case, the issue is likely the fact that `Poly` is very slow with highly multivariate polynomials, which is an issue with the dense representation it uses. The fix here is to avoid `Poly`. Perhaps you can use instead the new `ring()` polys system. – asmeurer Jul 05 '16 at 06:17
  • I tried using the rings.sring function but again no results after about a week. I didn't profile, so I can't really say whether it made a difference to the performance (a rough sketch of the ring-based approach follows these comments). – Jibbity jobby Jul 10 '16 at 15:36
  • I should also add that the elements are only monomial quadratics. I feel something strange is going on here. – Jibbity jobby Jul 11 '16 at 22:17
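
For reference, the ring-based attempt mentioned in the last two comments looked roughly like this; the expression below is just a small made-up stand-in, since in my case sring was given the full determinant:

    from sympy import symbols, expand
    from sympy.polys.rings import sring

    k = symbols('k')
    # small made-up expression standing in for the huge determinant
    expr = expand((2*k**2 + 3*k - 5)*(k**2 - 4*k + 1))

    # sring builds a polynomial ring from the expression and returns the
    # ring together with the expression converted to a (sparse) ring element
    R, p = sring(expr, k)

The sparse ring element avoids Poly's dense representation, but the coefficients still have to be pulled out somehow for the numerical root finding.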

2 Answers


The algorithm is just slow.

Sympy explains the Berkowitz method in its documentation, and references "On computing the determinant in small parallel time using a small number of processors"; for its implementation, look at the open-source sympy code.

The complexity of the Berkowitz algorithm is pretty hard to pin down, and it looks like if you don't want to brute-force the proof of its correctness you need to get involved in some pretty hairy combinatorics.

The algorithm is fast on highly parallelized architectures; it's mainly motivated by the fact that Gaussian elimination doesn't parallelize well. Formally, it's in the class NC^2. I might guess that your tests weren't being run on such an architecture. Some researchers into the algorithm seem to be contributors on CS.SE, if you have more questions on that topic.

The Polynomial Call is Slow

From the docs, there are multiple ways of constructing a polynomial, depending on what type of collection is passed into the constructor (list [1], tuple [2], or dictionary [3]); they involve different validation and have very different performance. I would point you to this note in that documentation (emphasis mine, capitalization theirs):

For interactive usage choose [1] as it’s safe and relatively fast.

CAUTION: Use [2] or [3] internally for time critical algorithms, when you know that coefficients and monomials will be valid sympy expressions. Use them with caution! If the coefficients are integers instead of sympy integers (e.g. 1 instead of S(1)) the polynomial will be created but you may run into problems if you try to print the polynomial. If the monomials are not given as tuples of integers you will have problems.
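
As a rough sketch (using made-up coefficients, not the question's determinant), the same polynomial can be constructed either from an expression or directly from a dict of monomials with sympy coefficients:

    from sympy import symbols, S
    from sympy.polys import Poly

    x = symbols('x')

    # [1] from an expression - safe, validated; this is what the question's code does
    p1 = Poly(3*x**2 + 2*x + 1, x)

    # [3] from a dict mapping monomial exponent tuples to sympy coefficients -
    # faster to construct, but unvalidated: use S(3) rather than 3, and the
    # monomials must be tuples of integers
    p2 = Poly({(2,): S(3), (1,): S(2), (0,): S(1)}, x)

    print(p1)
    print(p2)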


Sympy also reserves the right to lazily evaluate expressions until their output is needed. This is a significant part of the benefit of symbolic calculation: mathematical simplification can bring precision and performance gains, but it also means that the actual evaluation of a complex expression can be deferred until an unexpected point.
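
A toy illustration of that deferred cost (nothing to do with determinants, just the general effect): building a compact nested expression is cheap, and the blow-up only appears when it is forced into expanded or printed form:

    from sympy import symbols

    x, y = symbols('x y')

    # cheap: a compact, unexpanded power
    expr = (x + y)**30

    # expensive: forcing the expanded form materializes all 31 terms
    expanded = expr.expand()
    print(len(expanded.args))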

en_Knight
  • Thank you en_Knight. Give me some time to study here before I accept. – Jibbity jobby Jun 30 '16 at 20:42
  • @PeterBingham Take your time :) The quoted block on switching order from lexicographical to something else might be the easiest way to see an immediate performance gain – en_Knight Jun 30 '16 at 20:44
  • I've looked at the docs and they've been very helpful. I should point out though that my understanding is that the yellow background text refers to the initialisation method and not the ordering. Anyway, thanks for highlighting the polynomial as the source of the problem. I'd profiled all the other numerical calls but assumed (wrongly) that this was fairly innocuous. I'll add a profile plot to the question so you can see how dramatic increasing the size is. – Jibbity jobby Jun 30 '16 at 21:24
  • @PeterBingham you're right, it's not the ordering, they used similar notation for both and I got confused :) I'll edit – en_Knight Jun 30 '16 at 21:26

OK, so I returned to this after reading the literature and feeling (emphasis here) that Berkowitz should perform somewhere between O(n^2) and O(n^3).

What I found was that the input type to det and Poly makes a huge difference to the performance (I admit that the input type was not shown in my question). Wrapping the original matrix elements in poly(...) drastically improves performance.

Consider the following three pieces of code:

1: using MatrixSymbol. det takes 1.1s then still stuck at str after 30min

    from sympy import MatrixSymbol, Matrix

    # 10x10 matrix of purely symbolic entries X[i, j]
    X = MatrixSymbol('X', 10, 10)
    Xmat = Matrix(X)

    deter = Xmat.det(method='berkowitz')
    str(deter)

2: Represents my original problem code. det takes 1.8s then still stuck at Poly after 30min

    import sympy
    from sympy import Matrix, I
    from sympy.polys import Poly

    matSz = 10

    # elements are plain sympy expressions (quadratics in x)
    m = Matrix([[0.0]*matSz]*matSz)
    x = sympy.symbols('x')
    for i in range(matSz):
        for j in range(matSz):
            m[i, j] = 2.0*float(i+1)*x**2 + 2.0*float(j+1)*x - 5.0*float(i+1)

    deter = m.det(method='berkowitz')
    deter_poly = Poly(deter, x)  # required or an exception is raised
    roots = deter_poly.nroots()

3: As above but with m[i,j] wrapped in poly(...). det takes 3.0s, Poly 0.04s and nroots 0.27s

    import sympy
    from sympy import Matrix, I
    from sympy import poly
    from sympy.polys import Poly

    matSz = 10

    m = Matrix([[0.0]*matSz]*matSz)

    # elements are the same quadratics, but wrapped in poly(...)
    x = sympy.symbols('x')
    for i in range(matSz):
        for j in range(matSz):
            m[i, j] = poly(2.0*float(i+1)*x**2 + 2.0*float(j+1)*x*I - 5.0*float(i+1))

    deter = m.det(method='berkowitz')
    deter_poly = Poly(deter, x)  # required or an exception is raised
    roots = deter_poly.nroots()
Jibbity jobby
  • en_Knight. I made this the accepted answer for the sake of reference since it describes how I basically sorted the problem and in my case the algorithm is not too slow (at least for n=10). – Jibbity jobby Jul 12 '16 at 14:05