
Why has TensorFlow chosen to prefer padding on the bottom right?

With SAME padding, it would feel logical to me to anchor the kernel's center at the first real pixel. Because TensorFlow uses asymmetric padding (the extra row/column goes on the bottom/right), this creates a discrepancy with some other frameworks. I do understand that asymmetric padding is good in principle, because otherwise one would be left with an unused padding row/column.

If TensorFlow had given precedence to padding on the left and top, its convolutions and weights would match Caffe/cuDNN/other frameworks, and weight conversion would be compatible regardless of padding.
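For reference, this is how the SAME padding amounts are derived (the tf.nn.convolution docs describe the same formula); a minimal sketch rather than TF's actual implementation, and the helper name is mine, but it shows that whenever the total padding is odd the leftover column goes to the right/bottom:

import math

def tf_same_pad_1d(in_size, kernel_size, stride):
    # SAME produces ceil(in_size / stride) outputs
    out_size = math.ceil(in_size / stride)
    # total padding needed for that many windows to fit
    pad_total = max((out_size - 1) * stride + kernel_size - in_size, 0)
    pad_left = pad_total // 2          # floor(half) goes before (left/top)
    pad_right = pad_total - pad_left   # the odd leftover goes after (right/bottom)
    return pad_left, pad_right

print(tf_same_pad_1d(in_size=4, kernel_size=3, stride=2))  # (0, 1): extra column on the right
print(tf_same_pad_1d(in_size=4, kernel_size=2, stride=2))  # (0, 0): the example below needs no padding at all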

TF gives bottom- and right-padding precedence

Code:

import numpy as np
import tensorflow as tf
import torch
import torch.nn as nn

tf.enable_eager_execution()  # TF 1.x; eager execution is the default in TF 2.x

def conv1d_tf(data, kernel_weights, stride):
    # TF conv1d filters have shape [filter_width, in_channels, out_channels]
    filters = np.reshape(kernel_weights, [len(kernel_weights), 1, 1])
    out = tf.nn.conv1d(
        value=data,
        filters=filters,
        stride=stride,
        padding='SAME',
        data_format='NCW',
        )
    return out


def conv1d_pytorch(data, kernel_weights, stride):
    # PyTorch Conv1d weights have shape [out_channels, in_channels, kernel_size]
    filters = np.reshape(kernel_weights, [1, 1, len(kernel_weights)])
    kernel_size = len(kernel_weights)
    size = data.shape[-1]

    def same_padding(size, kernel_size, stride, dilation):
        # symmetric padding amount, applied equally on both sides
        padding = ((size - 1) * (stride - 1) + dilation * (kernel_size - 1)) // 2
        return padding

    padding = same_padding(size=size, kernel_size=kernel_size, stride=stride, dilation=0)
    conv = nn.Conv1d(
        in_channels=1,
        out_channels=1,
        kernel_size=kernel_size,
        stride=stride,
        bias=False,
        padding=padding,
        )
    conv.weight = torch.nn.Parameter(torch.from_numpy(filters))
    return conv(torch.from_numpy(data))


data = np.array([[[1, 2, 3, 4]]], dtype=np.float32)
kernel_weights = np.array([0, 1], dtype=np.float32)
stride = 2

out_tf = conv1d_tf(data=data, kernel_weights=kernel_weights, stride=stride)
out_pytorch = conv1d_pytorch(data=data, kernel_weights=kernel_weights, stride=stride)

print('TensorFlow: %s' % out_tf)
print('pyTorch: %s' % out_pytorch)

Output:

TensorFlow: tf.Tensor([[[2. 4.]]], shape=(1, 1, 2), dtype=float32)
pyTorch: tensor([[[1., 3.]]], grad_fn=<SqueezeBackward1>)
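To make it explicit where the discrepancy comes from, here is a small numpy sketch (the correlate helper is mine; it is plain strided cross-correlation, which is what both frameworks compute). With this input, SAME needs no padding at all, so TF's taps land on [1,2] and [3,4], while the symmetric padding computed by the PyTorch helper above shifts the first tap onto a zero on the left:

import numpy as np

x = np.array([1., 2., 3., 4.])
k = np.array([0., 1.])

def correlate(x, k, stride):
    # plain strided cross-correlation, no padding
    return [float(np.dot(x[i:i + len(k)], k))
            for i in range(0, len(x) - len(k) + 1, stride)]

print(correlate(x, k, stride=2))                                   # [2.0, 4.0] -> TF's result
print(correlate(np.pad(x, (1, 1), mode='constant'), k, stride=2))  # [1.0, 3.0, 0.0] -> left padding shifts the taps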
TimZaman
  • This does not answer your question, but this is not the only case where frameworks differ in padding (and incompatibilities don't only apply between TF and other frameworks) – etarion Mar 21 '17 at 15:21
  • Etarion: interesting! Elaborate? – TimZaman Mar 22 '17 at 15:53
  • One example I know is that Caffe and Apple's Metal Performance Shaders behave differently when you apply 3x3 pooling with stride 2 on a feature map with even sizes and no padding (at least you say you want no padding) ... Caffe acts like there's an implicit pixel of padding on the right/bottom side and produces a feature map that is one pixel larger than the one MPS produces. – etarion Mar 22 '17 at 18:04

1 Answer


This is for historical compatibility reasons with previous (non-public) frameworks. It is unfortunate that the definitions aren't clearer, since it's a common stumbling block when porting between different libraries.

Pete Warden