
I've gone through a lot of discussion* on this site about how to do Variable (Tensor) slice assignment in TensorFlow 2.0. Most of it recommends using assign() or tensor_scatter_nd_update(), both of which are covered in the official documentation: https://www.tensorflow.org/api_docs/python/tf/Variable. Each of these methods returns the entire Tensor holding the new values. My hypothesis is that this behavior is very inefficient when slice assignments are performed repeatedly on a large Tensor. How can I avoid this without having to change the TensorFlow source code? Is there a better approach?
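For reference, here is a minimal sketch of what I mean (toy shapes and values, just to show the behavior); both calls hand back the full tensor rather than only the updated slice:

import tensorflow as tf

v = tf.Variable(tf.zeros((4, 4)))

# assign() on a strided slice gives back the whole (4, 4) variable, not just row 0.
updated = v[0, :].assign(tf.ones(4))
print(updated.shape)  # (4, 4)

# tensor_scatter_nd_update likewise materializes a full new (4, 4) tensor.
t = tf.zeros((4, 4))
scattered = tf.tensor_scatter_nd_update(t, indices=[[0]], updates=[tf.ones(4)])
print(scattered.shape)  # (4, 4)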

In my case, I want to implement Gabor Convolutional Networks (GCNs)**, which require modulating (multiplying) each convolution weight by Gabor filter banks restricted to orientation only (Gabor Orientation Filters, GoFs). This modulation happens in the forward pass and does not change the weight values that gradient descent updates (if I'm reading the paper correctly).
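In case it helps, this is roughly what the modulation boils down to with a single Gabor filter (the shapes below are just an example of the usual Keras kernel layout, height x width x in_channels x filters; this is only a sketch, not the full orientation bank from the paper):

import tensorflow as tf

# Example shapes: a 3x3 kernel with 2 input channels and 4 filters,
# and a single 3x3 Gabor filter.
kernel = tf.random.normal((3, 3, 2, 4))
gabor = tf.random.normal((3, 3))

# Broadcasting the Gabor filter over the channel and filter axes modulates
# every 2-D kernel slice in one element-wise multiplication.
modulated = kernel * gabor[:, :, tf.newaxis, tf.newaxis]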

To simplify the case, I only use a single Gabor filter (rather than GoFs) for testing purposes. Here's my current code:

def call(self, inputs):
    # new_kernel refers to the same tf.Variable as self.kernel, so the
    # slice assign() below updates the layer weights in place.
    new_kernel = self.kernel
    for idx_channel in range(self.kernel.shape[2]):
        for idx_filter in range(self.filters):
            # Modulate one 2-D kernel slice by the Gabor filter; each
            # assign() call returns the whole kernel tensor.
            _ = new_kernel[:, :, idx_channel, idx_filter].assign(
                new_kernel[:, :, idx_channel, idx_filter] *
                self.gabor_kernel
            )

    outputs = K.conv2d(
        inputs,
        new_kernel,
        strides=self.strides,
        padding=self.padding,
        data_format=self.data_format,
        dilation_rate=self.dilation_rate)

    if self.use_bias:
        outputs = K.bias_add(
            outputs,
            self.bias,
            data_format=self.data_format)

    if self.activation is not None:
        return self.activation(outputs)
    return outputs

For comparison, I also carried out the operation on a numpy.array, with the code below:

def call(self, inputs):
    # Pull the kernel values out of the graph as a plain numpy array.
    new_kernel = K.eval(self.kernel)
    for idx_channel in range(self.kernel.shape[2]):
        for idx_filter in range(self.filters):
            new_kernel[:, :, idx_channel, idx_filter] *= self.gabor_kernel
    # Wrap the modulated values back into a tensor for the convolution.
    new_kernel = K.constant(new_kernel)
    ...

The last approach is very efficient, but the drawback is that the weights' gradients become None.
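If I understand it correctly, that is because K.eval() pulls the values out of the TensorFlow graph, so the multiplication is never traced and the gradient cannot flow back to the kernel. A minimal sketch of the effect (toy tensors, not my actual layer):

import tensorflow as tf

w = tf.Variable(tf.ones((2, 2)))
g = tf.constant([[0.5, 0.5], [0.5, 0.5]])

with tf.GradientTape() as tape:
    # The numpy round trip detaches the result from the graph, so the
    # gradient of the loss with respect to w is lost.
    detached = tf.constant(w.numpy() * g.numpy())
    loss = tf.reduce_sum(detached)

print(tape.gradient(loss, w))  # None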

I'd really appreciate any help. Thanks.


*For example, How to do slice assignment in Tensorflow

**Luan, S., Chen, C., Zhang, B., Han, J., & Liu, J. (2018). Gabor convolutional networks. IEEE Transactions on Image Processing, 27(9), 4357-4366. Retrieved from https://arxiv.org/pdf/1705.01450.pdf
