Max pool a single image in tensorflow using "tf.nn.avg_pool"

Question

I want to apply "tf.nn.max_pool()" to a single image, but the result has dimensions that are totally different from the input:

import tensorflow as tf
import numpy as np

ifmaps_1 = tf.Variable(tf.random_uniform( shape=[ 7, 7, 3], minval=0, maxval=3, dtype=tf.int32))

ifmaps=tf.dtypes.cast(ifmaps_1, dtype=tf.float64)

ofmaps_tf = tf.nn.max_pool([ifmaps], ksize=[1, 3, 3, 1], strides=[1, 2, 2, 1], padding="SAME")[0] # no padding

init = tf.initialize_all_variables()
with tf.Session() as sess:
    sess.run(init)
    print("ifmaps_tf = ")
    print(ifmaps.eval())
    print("ofmaps_tf = ")
    result = sess.run(ofmaps_tf)
    print(result)

I think this is related to applying pooling to a single example rather than to a batch. I need to do the pooling on a single example.

Any help is appreciated.

Do you want to max or avg pool? There is a conflict between your code and statement. — zihaozhihao, Oct 30 '19 at 20:52
Sorry for this confusion. I need both of them at the end and they both have the same issue in the result dimensions. Big thanks for your attention to details; I corrected it now. — mnabil, Oct 30 '19 at 21:18
So what are your expected result dimensions? The result looks correct to me. Your input is `(7,7,3)`, and your output is `(4,4,3)` because you use `SAME` padding. If you do not want padding, you should use `VALID` instead of `SAME`. Your comment and your code contradict each other. — zihaozhihao, Oct 30 '19 at 21:54
Here's a good SO-post about how to calculate the output size of a convolution layer. The same applies to Max./Avg.-pooling layers: https://stackoverflow.com/questions/53580088/calculate-the-output-size-in-convolution-layer — Tinu, Oct 31 '19 at 09:16
@zihaozhihao you are right. I mixed up the padding scheme labels and passed "SAME" thinking it meant no padding! Now, when I pass "VALID", the dimensions are correct! Can you please write that as an answer so I can accept it? Big thanks for your help and attention to detail! — mnabil, Oct 31 '19 at 09:33

zihaozhihao · Accepted Answer · 2019-10-31T16:14:09.217

Your input is (7,7,3), the kernel size is (3,3), and the stride is (2,2). So if you do not want any padding (as stated in your code comment), you should use padding="VALID", which returns a (3,3) output per channel, i.e. (3,3,3) overall. If you use padding="SAME", it returns a (4,4) output per channel, i.e. (4,4,3) overall.

Usually, the formula for the output size with SAME padding is:

out_size = ceil(in_size / stride)

With VALID padding it is:

out_size = ceil((in_size - filter_size + 1) / stride)
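
As a quick sanity check, here is a minimal sketch in plain Python (no TensorFlow needed) that evaluates the two formulas for the sizes in the question. The helper name `pool_output_size` is made up for illustration and is not part of any TensorFlow API; the VALID numerator is grouped as (in_size - filter_size + 1) before dividing by the stride.

```python
import math


def pool_output_size(in_size, filter_size, stride, padding):
    """Output size along one spatial dimension (hypothetical helper)."""
    if padding == "SAME":
        # SAME pads the input, so the kernel size does not affect the result.
        return math.ceil(in_size / stride)
    if padding == "VALID":
        # VALID uses no padding, so only fully covered windows count.
        return math.ceil((in_size - filter_size + 1) / stride)
    raise ValueError("padding must be 'SAME' or 'VALID'")


# Values from the question: 7x7x3 input, 3x3 kernel, stride 2.
for padding in ("SAME", "VALID"):
    size = pool_output_size(7, 3, 2, padding)
    print(padding, (size, size, 3))
# prints:
# SAME (4, 4, 3)
# VALID (3, 3, 3)
```

Pooling acts on each channel separately, so the channel count stays at 3 in both cases.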
