How to visualize each patch and combine the patches back into an image in Vision Transformer feature maps

Question

I am running the Keras Vision Transformer code and trying to visualize the variable "features", which is where I understand the encoded patch representation to be stored. How can I visualize each patch?

def create_vit_classifier():
    inputs = layers.Input(shape=input_shape)
    # Augment data.
    augmented = data_augmentation(inputs)
    # Create patches.
    patches = Patches(patch_size)(augmented)
    # Encode patches.
    encoded_patches = PatchEncoder(num_patches, projection_dim)(patches)

    # Create multiple layers of the Transformer block.
    for _ in range(transformer_layers):
        # Layer normalization 1.
        x1 = layers.LayerNormalization(epsilon=1e-6)(encoded_patches)
        # Create a multi-head attention layer.
        attention_output = layers.MultiHeadAttention(
            num_heads=num_heads, key_dim=projection_dim, dropout=0.1
        )(x1, x1)
        # Skip connection 1.
        x2 = layers.Add()([attention_output, encoded_patches])
        # Layer normalization 2.
        x3 = layers.LayerNormalization(epsilon=1e-6)(x2)

        # MLP.
        x3 = mlp(x3, hidden_units=transformer_units, dropout_rate=0.1)
        # Skip connection 2.
        encoded_patches = layers.Add()([x3, x2])

    # Normalize the tokens ([batch_size, num_patches, projection_dim]),
    # then flatten them to [batch_size, num_patches * projection_dim].
    representation = layers.LayerNormalization(epsilon=1e-6)(encoded_patches)
    representation = layers.Flatten()(representation)
    representation = layers.Dropout(0.5)(representation)
    # Add MLP.
    features = mlp(representation, hidden_units=mlp_head_units, dropout_rate=0.5)
    
    # Classify outputs.
    logits = layers.Dense(num_classes)(features)
    # Create the Keras model.
    model = keras.Model(inputs=inputs, outputs=logits)
    return model

From this function, I want to visualize the features of a single patch, and then the features of all patches as a matrix. Can anyone suggest how to visualize the feature map after training the model?

To be more exact, after running the Keras Vision Transformer code, I need to visualize the texture features produced by the last module. First, the image is partitioned into small patches of size 16*16 by the Patches layer. The attention layers then extract texture features, and the MLP head is trained on the variable features, which is a 1-D vector. I want to visualize the last-layer features for each patch.
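For the "combine back to an image" part, here is a minimal NumPy sketch of the patch layout I am assuming: patches ordered row by row across the image, each one flattened as (patch height, patch width, channels). The helper names image_to_patches and patches_to_image are hypothetical and are not part of the Keras example; the Patches layer itself may need its output checked against this layout.

```python
import numpy as np

def image_to_patches(image, patch_size):
    """Split an (H, W, C) image into flattened, row-major patches.

    Hypothetical helper: assumes H and W are multiples of patch_size.
    Returns an array of shape (num_patches, patch_size * patch_size * C).
    """
    h, w, c = image.shape
    gh, gw = h // patch_size, w // patch_size
    # (grid_row, patch_row, grid_col, patch_col, channel)
    tiles = image[: gh * patch_size, : gw * patch_size].reshape(
        gh, patch_size, gw, patch_size, c
    )
    # Bring the two grid axes to the front: (grid_row, grid_col, ph, pw, c)
    tiles = tiles.transpose(0, 2, 1, 3, 4)
    return tiles.reshape(gh * gw, patch_size * patch_size * c)

def patches_to_image(patches, patch_size, channels):
    """Reassemble flattened patches into one image.

    Hypothetical helper: assumes the patches form a square grid.
    """
    num_patches = patches.shape[0]
    grid = int(round(np.sqrt(num_patches)))
    if grid * grid != num_patches:
        raise ValueError("expected a square grid of patches")
    # (grid_row, grid_col, patch_row, patch_col, channel)
    tiles = patches.reshape(grid, grid, patch_size, patch_size, channels)
    # Interleave grid and patch axes: (grid_row, patch_row, grid_col, patch_col, c)
    tiles = tiles.transpose(0, 2, 1, 3, 4)
    return tiles.reshape(grid * patch_size, grid * patch_size, channels)
```

A single patch can then be viewed by reshaping one row to (patch_size, patch_size, channels), and the full image is recovered by passing all rows to patches_to_image.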

How can I visualize that from the above code?
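One thing worth noting from the code above: features comes after Flatten and the MLP head, so it no longer has a per-patch axis. The last tensor that still has one row per patch is the output of the final LayerNormalization, with shape [batch_size, num_patches, projection_dim]. Assuming that tensor has been extracted for one image (for example through a second keras.Model that outputs it), a rough sketch of turning it into a patch-level map could look like this. The function name patch_feature_heatmap is hypothetical, and using the L2 norm as the per-patch score is only one possible choice.

```python
import numpy as np

def patch_feature_heatmap(token_features, patch_size):
    """Turn per-patch features into an image-sized heatmap.

    token_features: array of shape (num_patches, projection_dim) for ONE image,
    assumed to be in row-major patch order on a square grid.
    Returns an array of shape (grid * patch_size, grid * patch_size) in [0, 1].
    """
    num_patches = token_features.shape[0]
    grid = int(round(np.sqrt(num_patches)))
    if grid * grid != num_patches:
        raise ValueError("expected a square grid of patches")

    # One scalar per patch: here the L2 norm of its feature vector (an assumption).
    scores = np.linalg.norm(token_features, axis=1).reshape(grid, grid)

    # Min-max normalize so the map can be shown with a colormap.
    span = scores.max() - scores.min()
    if span > 0:
        scores = (scores - scores.min()) / span
    else:
        scores = np.zeros_like(scores)

    # Repeat each score over a patch_size x patch_size block of pixels.
    return np.kron(scores, np.ones((patch_size, patch_size)))
```

The resulting array has the same height and width as the input image (when the image is an exact multiple of patch_size), so it can be shown next to the image or overlaid on it with any plotting library. To inspect one patch, a single row of token_features can be plotted directly as a 1-D vector.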
