
Usually when we use nn.LayerNorm, we want to normalize across the whole of the given dimension; otherwise I think it would be called group norm. So why, in the PyTorch documentation, do they use LayerNorm like this?

>>> # Image Example
>>> N, C, H, W = 20, 5, 10, 10
>>> input = torch.randn(N, C, H, W)
>>> # Normalize over the last three dimensions (i.e. the channel and spatial dimensions)
>>> # as shown in the image below
>>> layer_norm = nn.LayerNorm([C, H, W])
>>> output = layer_norm(input)
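To make concrete what the docs example does (this check is my addition, not part of the documentation): with `normalized_shape=[C, H, W]`, each sample is normalized jointly over all three trailing dimensions, so the per-sample mean is ~0 and the per-sample std is ~1:

```python
import torch
import torch.nn as nn

N, C, H, W = 20, 5, 10, 10
x = torch.randn(N, C, H, W)
out = nn.LayerNorm([C, H, W])(x)

# Statistics over the normalized dims (C, H, W), one value per sample:
per_sample_mean = out.mean(dim=(1, 2, 3))                  # ~0 for every sample
per_sample_std = out.std(dim=(1, 2, 3), unbiased=False)    # ~1 for every sample
print(per_sample_mean.abs().max(), per_sample_std.min())
```

If the module normalized each channel separately instead (group-norm-style), the per-sample statistics over all of (C, H, W) would not come out this uniform.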

What happens if I do not know the dimension sizes (C, H, W) in advance? How do I define a LayerNorm layer then?

JobHunter69

1 Answer


You can refer to this post by wandb: they create a function that pre-computes the image size during initialization and passes it to LayerNorm.

But as mentioned here, using LayerNorm with CNNs is generally not recommended.
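If you truly cannot pre-compute the shape, one shape-agnostic workaround (a sketch of my own, not taken from the linked post) is to call the functional form at run time with the input's own trailing dimensions. Note the trade-off: since the shape is unknown at construction time, this variant carries no learnable affine (weight/bias) parameters:

```python
import torch
import torch.nn.functional as F

class DynamicLayerNorm(torch.nn.Module):
    """Hypothetical helper: normalizes over every dim except the batch dim,
    whatever the input's trailing sizes are. No learnable weight/bias."""

    def __init__(self, eps: float = 1e-5):
        super().__init__()
        self.eps = eps

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # normalized_shape is taken from the input itself at call time.
        return F.layer_norm(x, x.shape[1:], eps=self.eps)

x = torch.randn(20, 5, 10, 10)      # any (N, C, H, W) works
y = DynamicLayerNorm()(x)
print(y.shape)
```

Because the normalized shape is read off the input, the same module works for inputs of different spatial sizes, at the cost of the affine transform that `nn.LayerNorm` would normally learn.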
