
Given a Pandas dataframe, what is the best way (readability OR execution speed) to convert to a cvxopt matrix or vice versa?

Currently I am doing:

cvxMat = matrix(pdObj.as_matrix())
pdObj[:]=np.array(cvxMat)

Also, is there a reasonably readable way of doing vector or matrix algebra using a mixture of cvxopt matrices and pandas dataframes without converting the objects?

The following is a vector dot product (pdObj & cvxMat are column vectors) that is far from readable:

(matrix(pdObj.as_matrix()).T*cvxMat)[0]

Any advice?


Follow-up to waitingkuo's answer:

Just for illustration with pandas dataframes:

>>> m1 = cvxopt.matrix([[1, 2, 3], [2, 3, 4]])
>>> m2 = pd.DataFrame(np.array(m1)).T

>>> m1
<3x2 matrix, tc='i'>

>>> m2.shape
(2, 3)

>>> np.dot(m1,m2)
array([[ 5,  8, 11],
       [ 8, 13, 18],
       [11, 18, 25]])

But note:

>>> m1 * m2
   0  1   2
0  1  4   9
1  4  9  16

[2 rows x 3 columns]
ARF
  • Similar: https://stackoverflow.com/questions/12551009/python3-conversion-between-cvxopt-matrix-and-numpy-array/45933678#45933678 – 0 _ Aug 29 '17 at 07:59

2 Answers


You can get the underlying NumPy array from a pandas object with pdObj.values.

You can do matrix multiplication between a cvxopt matrix and a NumPy matrix directly:

In [90]: m1 = cvxopt.matrix([[1, 2, 3], [2, 3, 4]])

In [91]: m2 = np.matrix([[1, 2, 3], [2, 3, 4]])

In [92]: m1
Out[92]: <3x2 matrix, tc='i'>

In [94]: m2.shape
Out[94]: (2, 3)

In [95]: m1 * m2
Out[95]: 
matrix([[ 5,  8, 11],
        [ 8, 13, 18],
        [11, 18, 25]]) 
waitingkuo
  • Many thanks. I was having some trouble translating your answer to pandas dataframes, as m1 * m2 does element-wise multiplication here. The only solution I found is np.dot(m1,m2): cvxopt matrices have no .dot() method, so m1.dot(m2) is unfortunately not possible. – ARF Apr 15 '14 at 09:27
  • If your m2 is a numpy array, it probably makes more sense to convert it to a numpy matrix or cvxopt matrix first, i.e. `m1 * cvxopt.matrix(m2)` – waitingkuo Apr 15 '14 at 11:08
  • My m2 is a pandas dataframe. The expression then becomes: `m1 * cvxopt.matrix(m2.values)`. That's ok for simple expressions but not ideal for more complicated ones. One can't have everything though, I suppose. For now I have decided to overload the `__new__` method of `cvxopt.matrix` to at least avoid having to use the `.values` property of the pandas dataframe. The expression then becomes: `m1 * matrix(m2)` – ARF Apr 15 '14 at 14:06

An alternative to messing with cvxopt __init__ is to define your own dot function.
A and B can be numpy arrays, array-likes, or anything with a .value or .values attribute:

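import sys
import numpy as np

def dot(A, B):
    """np.dot of A and B, using .value or .values if they exist."""
    for val in ("value", "values"):
        A = getattr(A, val, A)  # A.value / A.values if present, else A itself
        B = getattr(B, val, B)
    A = np.asanyarray(A)
    B = np.asanyarray(B)
    try:
        return np.dot(A, B)  # return the product instead of discarding it
    except ValueError:
        # report the offending shapes, then re-raise for the caller
        print("error: can't dot shapes %s x %s" % (A.shape, B.shape), file=sys.stderr)
        raise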
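For illustration, a helper of this kind might be used on a pandas column vector and a NumPy column vector as follows. This is a compact, self-contained sketch: `as_array`, `x` and `y` are hypothetical names, and only the `.values` attribute is unwrapped here.

```python
import numpy as np
import pandas as pd

def as_array(obj):
    """Unwrap pandas objects via .values, then coerce to a NumPy array."""
    return np.asarray(getattr(obj, "values", obj))

def dot(a, b):
    """np.dot on the unwrapped operands."""
    return np.dot(as_array(a), as_array(b))

x = pd.DataFrame({"x": [1.0, 2.0, 3.0]})   # 3x1 column vector as a DataFrame
y = np.array([[4.0], [5.0], [6.0]])        # 3x1 column vector as an ndarray

# (1x3) . (3x1) gives a 1x1 array; pick out the scalar.
inner = dot(x.T, y)[0, 0]
```

Here `inner` is 1*4 + 2*5 + 3*6 = 32.0.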

(By the way, I avoid matrices and stick to numpy arrays and vectors, but that is a separate issue.)

denis
  • Thanks for reminding me of this. Use of matrices is obligatory with cvxopt, hence my question on how to convert back and forth. – ARF Apr 15 '14 at 16:08