Note that this is essentially the same question as here, but unfortunately the solution does not work in general, even though it was accepted as the answer. Even more importantly, it fails in one of the main use cases for the ufunc.at method: when there are multiple identical indices.
For the rest of the post we will take np.add as the ufunc used in the demonstrations.
Consider the objects
row_idx: A 1D numpy array of integers with shape=(N,)
col_idx: A 1D numpy array of integers with shape=(N,)
values : A 1D numpy array of floats with shape=(N,)
M : A 2D numpy array of floats with shape=(k,k)
Importantly, the row and column combinations are not unique, i.e. the following implication does NOT hold:
(row_idx[i], col_idx[i]) == (row_idx[j], col_idx[j]) => i==j (1)
That means that, to reproduce the update performed by the following function,
def update_loop(row_idx, col_idx, values, M):
    for i in range(row_idx.size):
        M[row_idx[i], col_idx[i]] += values[i]
the following (perhaps unexpectedly for some) will not work when (1) does not hold, because NumPy's buffered fancy-indexing assignment applies only the last update for each repeated index.
def update_index(row_idx, col_idx, values, M):
    M[row_idx, col_idx] += values
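To make the failure concrete, here is a small self-contained example (the arrays are made up for illustration) showing that the fancy-indexed += applies only one of the updates targeting a repeated index:

```python
import numpy as np

k = 3
M = np.zeros((k, k))
# two of the three updates target the same entry (0, 0)
row_idx = np.array([0, 0, 1])
col_idx = np.array([0, 0, 2])
values = np.array([1.0, 2.0, 3.0])

M[row_idx, col_idx] += values
# the duplicated index (0, 0) received only the last update,
# not the accumulated sum 1.0 + 2.0
print(M[0, 0])  # 2.0, not 3.0
```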
But fortunately we can use
def update_ufunc(row_idx, col_idx, values, M):
    np.add.at(M, (row_idx, col_idx), values)
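A quick check (again with made-up arrays) that np.add.at accumulates all contributions and therefore matches the explicit loop:

```python
import numpy as np

k = 3
row_idx = np.array([0, 0, 1])
col_idx = np.array([0, 0, 2])
values = np.array([1.0, 2.0, 3.0])

# reference result from the explicit loop
M_loop = np.zeros((k, k))
for i in range(row_idx.size):
    M_loop[row_idx[i], col_idx[i]] += values[i]

# unbuffered in-place update: duplicates are accumulated
M_ufunc = np.zeros((k, k))
np.add.at(M_ufunc, (row_idx, col_idx), values)

print(np.array_equal(M_loop, M_ufunc))  # True; both give M[0, 0] == 3.0
```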
When dealing with the same issue in TensorFlow, the answer to the linked question suggested using sparse matrices, since the triplets (row_idx, col_idx, values) essentially represent a sparse matrix.
In particular if we have the following objects
row_idx_tf: A 2D tensorflow tensor of integers with shape=(N,1)
col_idx_tf: A 2D tensorflow tensor of integers with shape=(N,1)
values_tf : A 2D tensorflow tensor of floats with shape=(N,1)
M_tf : A 2D tensorflow tensor of floats with shape=(k,k)
The suggested code, adapted to our case, is essentially
idx = tf.concat((row_idx_tf, col_idx_tf), axis=1)
sparse_tensor = tf.SparseTensor(idx, tf.squeeze(values_tf, axis=1), [k, k])  # values must be 1D
tf.sparse_add(M_tf, sparse_tensor)
However, for this to behave like the np.add.at method, it hinges on the assumption that, when the sparse tensor is created, an index (row_idx[i], col_idx[i]) that already exists in the tensor has values[i] added to its current entry. This is mentioned in the comments of the previous question, as it is apparently the default behaviour of SciPy sparse CSR matrices. I can confirm that it is not the behaviour of TensorFlow sparse tensors, which simply keep the duplicate index entries side by side.
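For reference, the SciPy behaviour mentioned above is easy to verify: when a CSR matrix is built from coordinate triplets, duplicate coordinates are summed (a small made-up example):

```python
import numpy as np
from scipy.sparse import csr_matrix

row_idx = np.array([0, 0, 1])
col_idx = np.array([0, 0, 2])
values = np.array([1.0, 2.0, 3.0])

# the COO-style constructor sums duplicate (row, col) entries
S = csr_matrix((values, (row_idx, col_idx)), shape=(3, 3))
print(S[0, 0])  # 3.0, i.e. 1.0 + 2.0 accumulated
```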
So, given that the sparse-tensor method does not work, what is the most efficient way to perform the equivalent of the NumPy ufunc.at method in TensorFlow?
N.B. I want to make sure that the method makes full use of the available HPC hardware. Essentially we have N independent updates to perform, where the only thing we have to be careful of is multiple threads trying to update the same entry of M at the same time. Other than that, the problem seems perfectly parallelisable.