I have a SciPy sparse CSR matrix created from a sparse TF-IDF feature matrix in SVM-Light format. The number of features is huge and the data is sparse, so I have to use a SparseTensor; a dense representation is too slow.

For example, if the number of features is 5, a sample file can look like this:

0 4:1
1 1:3 3:4
0 5:1
0 2:1

After parsing, the training set looks like this:

trainX = <scipy CSR matrix>
trainY = np.array([0, 1, 0, 0])
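
To make the parsing step concrete, here is a minimal sketch of how the sample file above could be turned into row/column/value triplets and a label list. The helper name `parse_svmlight_lines` is hypothetical, and the sketch assumes integer labels, one example per line, and no comments or blank lines in the input.

```python
def parse_svmlight_lines(lines):
    """Parse SVM-Light lines into COO-style triplets plus a label list."""
    labels, rows, cols, vals = [], [], [], []
    for row, line in enumerate(lines):
        label, *features = line.split()
        labels.append(int(label))
        for item in features:
            index, value = item.split(":")
            rows.append(row)
            cols.append(int(index) - 1)  # SVM-Light feature indices are 1-based
            vals.append(float(value))
    return labels, rows, cols, vals


# The four sample lines from the file shown above.
sample = ["0 4:1", "1 1:3 3:4", "0 5:1", "0 2:1"]
labels, rows, cols, vals = parse_svmlight_lines(sample)
```

The triplets can then be handed to `scipy.sparse.csr_matrix((vals, (rows, cols)), shape=(4, 5))` to build trainX. For real files, `sklearn.datasets.load_svmlight_file` is probably the more robust choice.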

I have two important questions:

1) How do I convert this to a SparseTensor (sp_ids, sp_weights) efficiently so that I can perform fast multiplication (W.X) using lookup: https://www.tensorflow.org/versions/master/api_docs/python/nn.html#embedding_lookup_sparse

2) How do I shuffle the dataset at each epoch and recalculate sp_ids and sp_weights so that I can feed them (via feed_dict) for mini-batch gradient descent?

Example code for a simple model such as logistic regression would be much appreciated. The graph will look like this:

# GRAPH
mul = tf.nn.embedding_lookup_sparse(W, X_sp_ids, X_sp_weights, combiner = "sum")  # W.X
z = tf.add(mul, b) #  W.X + b


cost_op = tf.reduce_sum(tf.nn.sigmoid_cross_entropy_with_logits(labels=y_true, logits=z))  # the sigmoid is applied inside this op
train_op = tf.train.GradientDescentOptimizer(0.05).minimize(cost_op)  # construct optimizer

predict_op = tf.nn.sigmoid(z) # sig(W.X + b)
Salman Mohammed

1 Answer

I can answer the first part of your question.

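import numpy as np
import tensorflow as tf

def convert_sparse_matrix_to_sparse_tensor(X):
    coo = X.tocoo()
    # One (row, col) pair per stored value; SparseTensor expects int64 indices.
    indices = np.column_stack((coo.row, coo.col)).astype(np.int64)
    return tf.SparseTensor(indices, coo.data, coo.shape)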

First you convert the matrix to COO format. Then you extract the indices, values, and shape and pass those directly to the SparseTensor constructor.
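
To show what those three pieces look like, here is a small sketch using only NumPy and SciPy on the 4 x 5 example from the question. The helper name `coo_components` is hypothetical. The explicit row-major sort is an extra step I am assuming is useful, since many TensorFlow sparse ops expect indices in that order.

```python
import numpy as np
from scipy.sparse import csr_matrix


def coo_components(X):
    """Return (indices, values, dense_shape) for a SciPy sparse matrix."""
    coo = X.tocoo()
    # One (row, col) pair per stored value, as an int64 array of shape (nnz, 2).
    indices = np.column_stack((coo.row, coo.col)).astype(np.int64)
    # Sort by row first, then by column (lexsort uses the last key as primary).
    order = np.lexsort((coo.col, coo.row))
    return indices[order], coo.data[order], coo.shape


# The example from the question, with feature indices shifted to 0-based.
rows = [0, 1, 1, 2, 3]
cols = [3, 0, 2, 4, 1]
vals = [1.0, 3.0, 4.0, 1.0, 1.0]
trainX = csr_matrix((vals, (rows, cols)), shape=(4, 5))

indices, values, dense_shape = coo_components(trainX)
```

These three values are what you would pass, in that order, to the SparseTensor constructor.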

Dave DeCaprio
Instead of tf.SparseTensor in the return statement (which threw an exception for me), I used tf.SparseTensorValue and it worked great. – Ash Jun 27 '17 at 05:44
You may want to reorder: `return tf.sparse.reorder(tf.SparseTensor(indices, coo.data, coo.shape))` should do. – M.Winkens Aug 14 '20 at 13:18