I have the following code:

    dotp = np.dot(X[i], w)
    mult = -Y[i] * dotp
    lhs = Y[i] * X[i]
    rhs = logistic(mult)
    s += lhs * rhs

It throws the following error (truncated for brevity):

      File "/Users/leonsas/Projects/temp/learners/learners.py", line 26, in log_likelihood_grad
        s += lhs * rhs
      File "/usr/local/lib/python2.7/site-packages/numpy/matrixlib/defmatrix.py", line 341, in __mul__
        return N.dot(self, asmatrix(other))
    ValueError: matrices are not aligned

I was expecting `lhs` to be a column vector and `rhs` to be a scalar, so the operation should work. To debug, I printed their shapes:

    print "lhs", np.shape(lhs)
    print  "rhs", rhs, np.shape(rhs)

Which outputs:

    lhs (1, 18209)
    rhs [[ 0.5]] (1, 1)

So it seems that they are compatible for multiplication. Any thoughts as to what I am doing wrong?

EDIT: More information of what I'm trying to do.

This code implements a log-likelihood gradient to estimate coefficients.

[Image: formula for the log-likelihood gradient]

Where z is the dot product of the weights with the x values.

My attempt at implementing this:

    def log_likelihood_grad(X, Y, w, C=0.1):
        K = len(w)
        N = len(X)
        s = np.zeros(K)

        for i in range(N):
            dotp = np.dot(X[i], w)
            mult = -Y[i] * dotp
            lhs = Y[i] * X[i]
            rhs = logistic(mult)
            s += lhs * rhs

        s -= C * w

        return s

leonsas
  • Please specify which line triggers the exception. Also, ``.size`` is the total number of elements in the array; what you need to print (and include in your question) is ``.shape``. – fjarri Dec 16 '14 at 23:33
  • What are you trying to do with `lhs` and `rhs`, from a mathematical perspective? – Rufflewind Dec 16 '14 at 23:46

2 Answers

You have a matrix `lhs` of shape (1, 18209) and a matrix `rhs` of shape (1, 1), and you are trying to multiply them. Since both are of `matrix` type (as the stack trace suggests), the `*` operator translates to `dot`. A matrix product is defined only when the number of columns in the first matrix equals the number of rows in the second, and in your case they differ (18209 and 1). Hence the error.
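
To make the difference concrete, here is a minimal sketch that reproduces the behaviour with small stand-in shapes, (1, 5) instead of (1, 18209). The variable names and values are made up for illustration; only the shapes mirror the question.

```python
import numpy as np

# Small stand-ins for the shapes in the question: (1, 5) instead of (1, 18209).
lhs_m = np.matrix([[1.0, 2.0, 3.0, 4.0, 5.0]])
rhs_m = np.matrix([[0.5]])

# With np.matrix, * means matrix product, so (1, 5) times (1, 1) is rejected.
try:
    lhs_m * rhs_m
    matrix_error = None
except ValueError as exc:
    matrix_error = exc

# With plain ndarrays, * is elementwise and the (1, 1) operand is broadcast.
lhs_a = np.asarray(lhs_m)
rhs_a = np.asarray(rhs_m)
product = lhs_a * rhs_a

print(type(lhs_m).__name__, "->", "ValueError" if matrix_error is not None else "ok")
print(type(lhs_a).__name__, "->", product.shape)
```

The same two operands fail as `matrix` objects and succeed as `ndarray` objects, so checking `type(lhs)` and `type(rhs)` is a quick first diagnostic.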

How to fix it: check the maths behind the code and fix the formula. Perhaps you forgot to transpose the first matrix or something like that.

fjarri
  • I tried to reproduce the above with: `a = np.random.uniform(0,10,(1,10)); b = np.array([[0.5]]); a * b;` and it does work, so I guess numpy assumes, in this case, that the (1,1) matrix is scalar. Why doesn't this happen above as well? – leonsas Dec 17 '14 at 00:26
  • [There are ``ndarray`` and ``matrix`` types in numpy](http://wiki.scipy.org/NumPy_for_Matlab_Users). There are differences between them, in particular ``*`` translates to elementwise multiplication for arrays and to ``dot`` for matrices. Do ``print type(lhs), type(rhs)`` to check what you have. In your question you can see in the stack trace that ``dot`` gets called, so I assumed you had ``matrix`` objects. In this test you get ``ndarray`` objects, which results in broadcasted multiplication, for which the shapes are ok. – fjarri Dec 17 '14 at 00:42
  • Both are type `matrix`. I changed the flawed line to `s += lhs * np.float64(rhs)` hoping it'll work, and it does work until it throws: `ValueError: non-broadcastable output operand with shape (18209) doesn't match the broadcast shape (1,18209)`. Thoughts? (This might be getting off-topic) – leonsas Dec 17 '14 at 00:47
  • If you want to convert matrices to arrays, use [``numpy.asarray``](http://docs.scipy.org/doc/numpy/reference/generated/numpy.asarray.html). ``numpy.float64`` just converts your 1x1 matrix to a scalar. – fjarri Dec 17 '14 at 00:52
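
The second error in the comments arises because `s` has shape `(K,)` while `lhs * np.float64(rhs)` is still a `(1, K)` matrix, and an in-place `+=` cannot broadcast its output to a larger shape. One possible way around both errors is to convert the inputs to plain arrays once, so that every row `X[i]` is 1-D. The sketch below assumes `X` is a dense `np.matrix` (or array), that the labels in `Y` are -1 or +1, and that `logistic` is the standard sigmoid; the question does not show its own `logistic`, so that definition and the sample data are assumptions.

```python
import numpy as np

def logistic(z):
    # Assumed definition; the question does not show its own logistic().
    return 1.0 / (1.0 + np.exp(-z))

def log_likelihood_grad(X, Y, w, C=0.1):
    # Convert possible np.matrix inputs to plain ndarrays up front.
    X = np.asarray(X, dtype=float)
    Y = np.asarray(Y, dtype=float).ravel()
    w = np.asarray(w, dtype=float).ravel()

    s = np.zeros(len(w))
    for i in range(len(X)):
        z = np.dot(X[i], w)  # scalar, since X[i] and w are both 1-D
        s += Y[i] * X[i] * logistic(-Y[i] * z)  # shape (K,), same as s
    s -= C * w
    return s

# Made-up data: two samples, two features, passed in as np.matrix on purpose.
X = np.matrix([[1.0, 2.0], [3.0, 4.0]])
Y = [1, -1]
w = np.zeros(2)
grad = log_likelihood_grad(X, Y, w)
print(grad.shape)
```

Converting once at the top keeps the loop body identical in structure to the original, while `*` now means elementwise multiplication throughout.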

Vectors in NumPy have shapes like `(3,)`. `np.dot(a, b)` on two such vectors computes their inner product, so it raises a dimension error when their lengths differ. If what you want is the outer product, use `np.outer(a, b)` instead.
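
As a small illustration of that distinction, with made-up vectors of different lengths:

```python
import numpy as np

a = np.array([1.0, 2.0, 3.0])
b = np.array([4.0, 5.0])

# np.dot on 1-D arrays is an inner product, so the lengths must match.
try:
    np.dot(a, b)
    dot_error = None
except ValueError as exc:
    dot_error = exc

# np.outer accepts different lengths and returns a (len(a), len(b)) array.
outer = np.outer(a, b)

# With equal lengths, np.dot works and returns a scalar.
inner = np.dot(a, a)

print(outer.shape, inner)
```

Whether `np.outer` is the right replacement depends on the formula being implemented; for the gradient in the question, each term is a vector scaled by a scalar rather than an outer product.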

guler