I want to apply "tf.nn.max_pool()" to a single image, but the result has dimensions that are totally different from what I expect given the input:

import tensorflow as tf
import numpy as np

ifmaps_1 = tf.Variable(tf.random_uniform( shape=[ 7, 7, 3], minval=0, maxval=3, dtype=tf.int32))

ifmaps=tf.dtypes.cast(ifmaps_1, dtype=tf.float64)

ofmaps_tf = tf.nn.max_pool([ifmaps], ksize=[1, 3, 3, 1], strides=[1, 2, 2, 1], padding="SAME")[0] # no padding

init = tf.initialize_all_variables()
with tf.Session() as sess:
    sess.run(init)
    print("ifmaps_tf = ")
    print(ifmaps.eval())
    print("ofmaps_tf = ")
    result = sess.run(ofmaps_tf)
    print(result)

I think this is related to applying pooling to a single example rather than to a batch. I need to do the pooling on a single example.

Any help is appreciated.

mnabil
  • Do you want max or avg pooling? Your code and your statement conflict. – zihaozhihao Oct 30 '19 at 20:52
  • Sorry for this confusion. I need both of them at the end and they both have the same issue in the result dimensions. Big thanks for your attention to details; I corrected it now. – mnabil Oct 30 '19 at 21:18
  • So what are your expected result dimensions? At least, the result looks correct to me. Your input is `(7,7,3)`, and your output is `(4,4,3)` because you use `SAME` padding. If you do not want padding, you should use `VALID` instead of `SAME`. Your comment and your code contradict each other. – zihaozhihao Oct 30 '19 at 21:54
  • Here's a good SO post about how to calculate the output size of a convolution layer. The same applies to max/avg pooling layers: https://stackoverflow.com/questions/53580088/calculate-the-output-size-in-convolution-layer – Tinu Oct 31 '19 at 09:16
  • @zihaozhihao you are right. I mixed up the padding scheme labels and passed "SAME" thinking it meant no padding! Now, when I pass "VALID", the dimensions are correct! Can you please write that as an answer so I can accept it? Big thanks for your help and attention to detail! – mnabil Oct 31 '19 at 09:33
  • I have added the answer. Please check it. – zihaozhihao Oct 31 '19 at 16:14

1 Answer

Your input is (7,7,3), the kernel size is (3,3), and the stride is (2,2). So if you do not want any padding (as stated in your code comment), you should use padding="VALID", which returns a (3,3) output in the spatial dimensions, i.e. (3,3,3) including channels. If you use padding="SAME", it returns a (4,4) output, i.e. (4,4,3).

Usually, the formula for the output size along each spatial dimension with SAME padding is:

out_size = ceil(in_size / stride)

With VALID padding it is:

out_size = ceil((in_size - filter_size + 1) / stride)
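
As a quick check of the two formulas against the numbers in the question, here is a minimal sketch in plain Python. The helper name `pooled_size` is hypothetical (it is not a TensorFlow function); it only evaluates the formulas for one spatial dimension.

```python
import math

def pooled_size(in_size, filter_size, stride, padding):
    """Output length along one spatial dimension (hypothetical helper, not a TensorFlow API)."""
    if padding == "SAME":
        # SAME pads the input as needed, so only the stride determines the output size.
        return math.ceil(in_size / stride)
    if padding == "VALID":
        # VALID adds no padding, so every window must fit entirely inside the input.
        return math.ceil((in_size - filter_size + 1) / stride)
    raise ValueError("padding must be 'SAME' or 'VALID'")

# Values from the question: 7x7 input, 3x3 window, stride 2.
same = pooled_size(7, 3, 2, "SAME")
valid = pooled_size(7, 3, 2, "VALID")
print(same, valid)  # 4 3
```

The channel dimension is not pooled here (ksize and strides are 1 for it), so the full output shapes are (4,4,3) for SAME and (3,3,3) for VALID.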
zihaozhihao