
I'm trying to apply the first technique described here (https://arxiv.org/abs/1510.00149). Basically, I want to lighten a ResNet convolutional network by modifying the way the weights are stored and used: briefly, I want to transform each weight matrix into a matrix of indexes, where each index points into a vector of k values, each value being the centroid of one of the k clusters computed over the original matrix.
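For reference, this is a minimal sketch of the weight-sharing step I mean, in plain NumPy/scikit-learn (the function names are mine, not from the paper):

import numpy as np
from sklearn.cluster import KMeans

def quantize_weights(weights, k=16):
    # Cluster all weight values into k centroids and replace every weight
    # by the index of its nearest centroid.
    flat = weights.reshape(-1, 1)
    kmeans = KMeans(n_clusters=k, n_init=10).fit(flat)
    indices = kmeans.labels_.reshape(weights.shape)   # same shape as weights, integer ids
    codebook = kmeans.cluster_centers_.flatten()      # k centroid values
    return indices, codebook

def restore_weights(indices, codebook):
    # Rebuild a dense weight matrix from the index matrix and the codebook.
    return codebook[indices]

w = np.random.randn(3, 3, 64, 64).astype(np.float32)  # e.g. a conv kernel
idx, cb = quantize_weights(w, k=16)
w_restored = restore_weights(idx, cb)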

To do that I need to change the way each weight is stored and used for the convolution. I already did this for a small CNN implemented in TensorFlow; I was able to do it because with tf.nn.conv2d (https://www.tensorflow.org/versions/r1.15/api_docs/python/tf/nn/conv2d) you pass directly the weights that TF will use for the convolution.
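The approach in the small CNN looks roughly like this (TF 1.15 graph mode; variable names and shapes are just for illustration): the codebook of centroids is the trainable variable, the index matrix is kept fixed, and the dense kernel is rebuilt right before being passed to tf.nn.conv2d.

import numpy as np
import tensorflow as tf  # TF 1.15

indices_np = np.random.randint(0, 16, size=(3, 3, 3, 32))             # stand-in index matrix
indices = tf.constant(indices_np, dtype=tf.int32)
codebook = tf.get_variable('codebook', shape=[16], dtype=tf.float32)  # k = 16 centroids

# Restore the dense kernel from indexes + centroids and hand it to the conv op.
kernel = tf.gather(codebook, indices)                                 # shape [3, 3, 3, 32]

images = tf.placeholder(tf.float32, [None, 32, 32, 3])
conv = tf.nn.conv2d(images, kernel, strides=[1, 1, 1, 1], padding='SAME')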

The problem now is that I'm also trying to apply this method to a ResNet implementation based on tf-slim (https://github.com/google-research/tf-slim/tree/8f0215e924996d7287392241bc8d8b1133d0c5ca). In this implementation each convolution is applied using tf.layers.Conv2D (https://www.tensorflow.org/versions/r1.15/api_docs/python/tf/layers/Conv2D) instead: with this API you don't have access to the weight matrix, you simply pass TensorFlow the input and the number of filters you want to apply (along with other parameters, of course), and the function returns only the result of the convolution, hiding the weights, which are created and trained automatically by TensorFlow.

The question is: how can I get access to those weights? Is there a way to get at the implementation of this convolution? I only need to find a way to pass the real weight matrix to the function.

Thank you.

P.S. Just to give you some code and a concrete example: I want to transform this tf-slim convolution implementation so that I take the weights of the model, restore a classical weight matrix from them, and pass it to the convolution (see the sketch after the snippet).

layer = tf.layers.Conv2D(
    filters=num_outputs,
    kernel_size=kernel_size,
    strides=stride,
    padding=padding,
    data_format=df,
    dilation_rate=rate,
    activation=None,
    use_bias=not normalizer_fn and biases_initializer,
    kernel_initializer=weights_initializer,
    bias_initializer=biases_initializer,
    kernel_regularizer=weights_regularizer,
    bias_regularizer=biases_regularizer,
    activity_regularizer=None,
    trainable=trainable,
    name=sc.name,
    dtype=inputs.dtype.base_dtype,
    _scope=sc,
    _reuse=reuse)
outputs = layer.apply(inputs)
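
What I would like is, roughly, to replace that layer with something along these lines (untested sketch; `indices` and `codebook` are the index matrix and centroid vector described above, and the bias/normalizer/regularizer handling of the original wrapper is left out):

# Rebuild the dense kernel from the index matrix and the centroid codebook,
# then run the convolution directly so that the weights stay under my control.
# Untested: the slim arguments would still have to be translated, e.g. a scalar
# `stride` into [1, stride, stride, 1], `rate` into `dilations`, `data_format`
# into 'NHWC'/'NCHW'; tf.nn.conv2d expects 'SAME'/'VALID' padding.
kernel = tf.gather(codebook, indices)   # shape [kernel_size, kernel_size, in_ch, num_outputs]
outputs = tf.nn.conv2d(
    inputs,
    kernel,
    strides=[1, stride, stride, 1],
    padding=padding,
    dilations=[1, rate, rate, 1],
    data_format='NHWC')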