
I have a 3-dimensional tensor and I'm trying to traverse it with a 2D sliding window, as illustrated below:

[figure: a grid of letters with a 3x3 window sliding across the first two dimensions]

In this image, each letter represents an n-element array, and the window size is 3x3. The window is always square: 3x3, 5x5, etc.

I'm failing to find a way to implement this without NumPy or loops. My objective is to use only TensorFlow vectorized operations. Any ideas?

Duloren

1 Answer


Suppose we create an n*n matrix m:

m = [
    ['a', 'b', 'c', 'd', 'e'],
    ['f', 'g', 'h', 'i', 'j'],
    ['k', 'l', 'm', 'n', 'o'],
    ['p', 'q', 'r', 's', 't'],
    ['u', 'v', 'w', 'y', 'x'],
]
import tensorflow as tf

def conv_slide_window(matrix_len, window_size, stride):

    # Integer matrix holding 1..n^2; these values line up with the
    # Tokenizer's word indices, so they can be mapped back to letters.
    matrix = tf.reshape(tf.range(matrix_len**2) + 1, (matrix_len, matrix_len))

    span = matrix_len - window_size
    assert span % stride == 0, "(matrix_len - window_size) must be divisible by the stride"
    n_patches = (span // stride + 1) ** 2

    # Extract every window_size x window_size patch in one vectorized op.
    patches = tf.image.extract_patches(images=matrix[None, ..., None],
                                       sizes=[1, window_size, window_size, 1],
                                       strides=[1, stride, stride, 1],
                                       rates=[1, 1, 1, 1], padding='VALID')

    # Map each integer back to its letter, then lay out one 2D patch per row.
    letters = tokenize.sequences_to_texts(patches.numpy().reshape(-1, 1).tolist())
    return tf.reshape(letters, (n_patches, window_size, window_size))
# First do some pre-processing.
# extract_patches works on numeric tensors, so define a Tokenizer that
# maps each letter to an integer index; you cannot slide over the
# letters directly.
tokenize = tf.keras.preprocessing.text.Tokenizer()
tokenize.fit_on_texts(m)
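# Note: every letter occurs exactly once here, so the fitted word_index
# follows order of first appearance ({'a': 1, 'b': 2, ..., 'y': 24, 'x': 25}),
# which matches the integer matrix conv_slide_window builds with tf.range(...)+1.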
    
window_size = 3
# Can also change the stride; in your case the stride is 1.
stride = 1
conv_slide_window(len(m), window_size, stride)
<tf.Tensor: shape=(9, 3, 3), dtype=string, numpy=
array([[[b'a', b'b', b'c'],
        [b'f', b'g', b'h'],
        [b'k', b'l', b'm']],

       [[b'b', b'c', b'd'],
        [b'g', b'h', b'i'],
        [b'l', b'm', b'n']],

       [[b'c', b'd', b'e'],
        [b'h', b'i', b'j'],
        [b'm', b'n', b'o']],

       [[b'f', b'g', b'h'],
        [b'k', b'l', b'm'],
        [b'p', b'q', b'r']],

       [[b'g', b'h', b'i'],
        [b'l', b'm', b'n'],
        [b'q', b'r', b's']],

       [[b'h', b'i', b'j'],
        [b'm', b'n', b'o'],
        [b'r', b's', b't']],

       [[b'k', b'l', b'm'],
        [b'p', b'q', b'r'],
        [b'u', b'v', b'w']],

       [[b'l', b'm', b'n'],
        [b'q', b'r', b's'],
        [b'v', b'w', b'y']],

       [[b'm', b'n', b'o'],
        [b'r', b's', b't'],
        [b'w', b'y', b'x']]], dtype=object)>
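The same primitive handles the 3-dimensional case from the question directly, where each cell holds an n-element array: tf.image.extract_patches treats the last axis as channels, so no tokenizer round-trip is needed for numeric data. A minimal sketch, assuming a hypothetical (5, 5, n) float tensor t with n = 4:

import tensorflow as tf

# Hypothetical input: a 5x5 grid where each cell is an n-element array.
n = 4
t = tf.reshape(tf.range(5 * 5 * n, dtype=tf.float32), (5, 5, n))

window = 3  # the window is always square: 3x3, 5x5, ...
patches = tf.image.extract_patches(images=t[None, ...],  # add a batch dim
                                   sizes=[1, window, window, 1],
                                   strides=[1, 1, 1, 1],
                                   rates=[1, 1, 1, 1],
                                   padding='VALID')

# One row per window position; each patch keeps its 2D layout and the
# per-cell arrays on the last axis.
patches = tf.reshape(patches, (-1, window, window, n))
print(patches.shape)  # (9, 3, 3, 4)

Everything stays inside vectorized TensorFlow ops: no NumPy, no Python loops.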
Mohammad Ahmed