
min_C Σ_i ( ||x_i - X c_i||^2 + λ ||c_i|| ),

s.t. c_ii = 0,

where X is a matrix of shape d × n and C is of shape n × n; x_i and c_i denote the i-th columns of X and C, respectively.

X is known here and based on X we want to find C.

xxx222

1 Answer

Usually with a loss like that you want to vectorize it instead of working column by column:

loss = X - tf.matmul(X, C)             # residual X - XC, shape d x n
loss = tf.reduce_sum(tf.square(loss))  # squared Frobenius norm

reg_loss = tf.reduce_sum(tf.square(C), 0)    # squared L2 norm of each column of C
reg_loss = tf.reduce_sum(tf.sqrt(reg_loss))  # sum of column L2 norms

total_loss = loss + lambd * reg_loss
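As a sanity check (not part of the answer's TensorFlow code), a small NumPy sketch can confirm that the vectorized loss above equals the per-column sum in the original objective:

```python
import numpy as np

rng = np.random.default_rng(0)
d, n = 4, 6
X = rng.normal(size=(d, n))
C = rng.normal(size=(n, n))

# Vectorized form: ||X - XC||_F^2 plus the sum of column-wise L2 norms of C.
residual = X - X @ C
loss_vec = np.sum(residual ** 2)
reg_vec = np.sum(np.sqrt(np.sum(C ** 2, axis=0)))

# Column by column, following sum_i ||x_i - X c_i||^2 + lam * ||c_i||.
loss_cols = sum(np.sum((X[:, i] - X @ C[:, i]) ** 2) for i in range(n))
reg_cols = sum(np.linalg.norm(C[:, i]) for i in range(n))

assert np.isclose(loss_vec, loss_cols)
assert np.isclose(reg_vec, reg_cols)
```

This works because multiplying X by C applies X to every column c_i at once, and the Frobenius norm sums the squared column residuals.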

To implement the zero constraint on the diagonal of C, the best way is to add it to the loss with another constant lambd2:

reg_loss2 = tf.trace(tf.square(C))  # sum of squared diagonal entries of C
total_loss = total_loss + lambd2 * reg_loss2
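A quick NumPy check (my addition, not the answer's code) that this penalty is exactly the sum of squared diagonal entries, so driving it to zero enforces c_ii = 0:

```python
import numpy as np

rng = np.random.default_rng(1)
C = rng.normal(size=(5, 5))

# Elementwise square then trace, matching tf.trace(tf.square(C)).
penalty = np.trace(C ** 2)
diag_sq = np.sum(np.diag(C) ** 2)  # sum_i c_ii^2
assert np.isclose(penalty, diag_sq)

# With a zero diagonal the penalty vanishes.
np.fill_diagonal(C, 0.0)
assert np.trace(C ** 2) == 0.0
```

Note this only penalizes the diagonal; it does not make it exactly zero unless lambd2 is large (or you project the diagonal to zero after each update).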
Olivier Moindrot
  • Thank you so much! I was thinking of using tf.slice() to get a column of the matrix, do you think that would work as well? What is the mechanism of tf.slice()? – xxx222 Jul 15 '16 at 01:55
  • That would also work as the gradient would backpropagate to the original variable C (and X), but it would be very inefficient – Olivier Moindrot Jul 15 '16 at 01:56