Implementing residual block

Question

In the traditional residual block, is the "addition" of layer N to the output of layer N+2 (prior to non-linearity) element-wise addition or concatenation?

The literature indicates something like this:

X1 = X
X2 = relu(conv(X1))
X3 = conv(X2)
X4 = relu(conv(X3 + X1))

https://stackoverflow.com/q/46902386/712995 – Maxim Dec 22 '17 at 15:22

score 1 · Answer 1 · answered Dec 22 '17 at 09:02

It has to be element-wise addition; with concatenation you don't get a residual function. You also need to use the proper padding mode so that the convolutions produce outputs with the same spatial dimensions as the block input, otherwise the two tensors cannot be added.
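
To make the difference concrete, here is a minimal NumPy sketch of such a block. It is an illustration under assumptions, not a reference implementation: the helper names (`conv2d_same`, `residual_block`), the 3x3 kernels, the channels-last layout, and the random weights are all made up for this example, and batch normalization and biases are left out.

```python
import numpy as np

def relu(x):
    return np.maximum(x, 0.0)

def conv2d_same(x, w):
    """Naive stride-1 convolution with zero 'same' padding.

    x: (H, W, C_in), w: (k, k, C_in, C_out) with odd k.
    Returns an array of shape (H, W, C_out).
    """
    k = w.shape[0]
    pad = k // 2
    # Zero-pad only the two spatial axes so the output keeps H and W.
    xp = np.pad(x, ((pad, pad), (pad, pad), (0, 0)))
    h, wd, _ = x.shape
    out = np.zeros((h, wd, w.shape[3]))
    for i in range(h):
        for j in range(wd):
            patch = xp[i:i + k, j:j + k, :]            # (k, k, C_in)
            out[i, j, :] = np.tensordot(patch, w, axes=3)
    return out

def residual_block(x, w1, w2):
    x2 = relu(conv2d_same(x, w1))
    x3 = conv2d_same(x2, w2)
    # Element-wise addition of the block input, then the non-linearity.
    return relu(x3 + x)

rng = np.random.default_rng(0)
x = rng.normal(size=(8, 8, 4))                 # hypothetical 8x8 input, 4 channels
w1 = rng.normal(size=(3, 3, 4, 4)) * 0.1
w2 = rng.normal(size=(3, 3, 4, 4)) * 0.1

y = residual_block(x, w1, w2)
concat = np.concatenate([conv2d_same(x, w1), x], axis=-1)

print(y.shape)       # (8, 8, 4): same shape as the input
print(concat.shape)  # (8, 8, 8): concatenation doubles the channels

# With all-zero weights the block reduces to relu(x), i.e. the identity
# on non-negative inputs. That is the "residual" property.
x_pos = np.abs(x)
zero = np.zeros_like(w1)
print(np.allclose(residual_block(x_pos, zero, zero), x_pos))  # True
```

Addition only works because "same" padding and an equal channel count keep `x3` and `x` the same shape. Concatenation would instead grow the channel dimension, so the layers would no longer be learning a correction added to the input.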

Dr. Snoopy

Thanks for that explanation. Since the convolutions must retain dimensions, does the input ever change size as it flows through the net? – rodrigo-silveira Dec 22 '17 at 14:56
@rodrigo-silveira It does through some blocks between the residual ones. Read the ResNet paper for more information about their architecture. – Dr. Snoopy Dec 22 '17 at 20:35