What are b, y, x and c which get flattened and returned along with the max-pooled features in tf.nn.max_pool_with_argmax?

Question

I went through the documentation of tf.nn.max_pool_with_argmax, which says:

Performs max pooling on the input and outputs both max values and indices.

The indices in argmax are flattened, so that a maximum value at position [b, y, x, c] becomes flattened index ((b * height + y) * width + x) * channels + c.

The indices returned are always in [0, height) x [0, width) before flattening, even if padding is involved and the mathematically correct answer is outside (either negative or too large). This is a bug, but fixing it is difficult to do in a safe backwards compatible way, especially due to flattening.

The variables b, y, x and c are not explicitly defined, so I am having trouble using this method. Can someone please explain what they refer to?

score 0 · Answer 1 · answered Dec 24 '18 at 15:25


I am unable to comment due to reputation.

But I think the variables refer to a position in the input tensor, which has shape [batch, height, width, channels]. b is the index of the example within the batch, y and x are the row and column of the maximum value, and c is the channel. They are not properties of the pooling window; you set the window size separately with ksize.

If you are having a problem implementing max pooling with argmax it has little to do with these variables. You might want to specify the issue you are having with Max Pooling.

answered Dec 24 '18 at 15:25

James Kl


As you will have seen, these values are returned in flattened form. To extract them I have to apply some math, such as c = returned_value % channels, and so on. After that I need to use the resulting pixel coordinates for a clustering operation. My question is whether x and y are the coordinates of the pixel in the original image, or whether (b, c) are. I went through the GitHub source code and still cannot find what b, c, x and y are. – Anubhav Pandey Dec 25 '18 at 05:18
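The decoding described in this comment can be sketched in plain Python. This is only an illustration of the formula quoted from the documentation, assuming the input tensor is laid out as [batch, height, width, channels], so that height, width and channels are the dimensions of the input (not of the pooled output). The helper names flatten_index and unflatten_index are made up for this example and are not part of TensorFlow. Depending on the TensorFlow version and the include_batch_in_index argument, the batch term may be left out of the flattened index, in which case b will always decode to 0.

```python
def flatten_index(b, y, x, c, height, width, channels):
    """Flatten [b, y, x, c] using the formula quoted from the docs."""
    return ((b * height + y) * width + x) * channels + c


def unflatten_index(flat, height, width, channels):
    """Invert the formula: peel off c, then x, then y; what remains is b."""
    rest, c = divmod(flat, channels)
    rest, x = divmod(rest, width)
    b, y = divmod(rest, height)
    return b, y, x, c


# Hypothetical input dimensions, chosen only for illustration.
height, width, channels = 4, 5, 3

flat = flatten_index(1, 2, 3, 0, height, width, channels)
print(flat)                                            # 99
print(unflatten_index(flat, height, width, channels))  # (1, 2, 3, 0)
```

Under this reading, (y, x) is the pixel position in the input feature map, which is what a clustering step on coordinates would need, while b selects the image in the batch and c selects the channel.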
