I'm reading about AlphaGo Zero's network structure and came across this cheatsheet:
I'm having a hard time understanding how skip connections work dimensionally.
Specifically, it looks to me like each residual layer ends up with two stacked copies of the input it receives. Wouldn't that cause the input to each successive block to grow exponentially with the depth of the network?
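To make my confusion concrete, here's a minimal sketch of what I currently picture a block doing. This is my own reconstruction in PyTorch, not code from the paper or the cheatsheet:

```python
import torch
import torch.nn as nn

class ConcatBlock(nn.Module):
    """A residual block as I currently picture it: the skip connection
    stacks the block's input on top of its conv output along the
    channel dimension, so the channel count doubles every block."""
    def __init__(self, channels):
        super().__init__()
        self.conv = nn.Conv2d(channels, channels, kernel_size=3, padding=1)

    def forward(self, x):
        # Concatenate input and conv output along channels:
        # (N, C, H, W) -> (N, 2C, H, W)
        return torch.cat([x, self.conv(x)], dim=1)

x = torch.randn(1, 256, 19, 19)  # 256 filters on a 19x19 board, as in AlphaGo Zero's tower
block = ConcatBlock(256)
print(block(x).shape)  # torch.Size([1, 512, 19, 19]) -- channels doubled
```

If each block works like this, then a tower of n blocks would turn C channels into 2^n * C channels, which is exactly the exponential blow-up I'm worried about.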
And could this be avoided by changing the output channel size of the conv2d filter? I see that in_C and out_C don't have to be equal in PyTorch, but I don't know enough to understand the implications of choosing them differently.
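For reference, here's the kind of thing I was imagining: a conv whose out_C is half its in_C, which I thought might cancel out the doubling. Again, this is just my own toy example:

```python
import torch
import torch.nn as nn

# Toy check: in_C and out_C of a conv2d are allowed to differ.
# Here a 3x3 conv maps 512 channels back down to 256, leaving H and W unchanged.
conv = nn.Conv2d(in_channels=512, out_channels=256, kernel_size=3, padding=1)
y = conv(torch.randn(1, 512, 19, 19))
print(y.shape)  # torch.Size([1, 256, 19, 19])
```

This runs and gives the shape I'd expect, but I don't know whether this is actually how the network keeps its size constant, or whether I'm misreading what the skip connection does in the first place.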