Questions tagged [deep-residual-networks]

A residual neural network is a class of deep, feed-forward artificial neural networks that utilizes skip connections or short-cuts to jump over some layers in order to make the optimization of very deep networks tractable.

The motivation for skipping over layers is to avoid the problem of vanishing gradients that might occur in very deep neural networks.

By reusing activations from a previous layer until the adjacent layer has learned its weights, the network effectively collapses into fewer layers in the initial phase and gradually expands as it learns more of the feature space, which makes the optimization of very deep networks tractable.
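
As a concrete illustration (a minimal sketch in PyTorch, with arbitrary layer sizes, not tied to any particular architecture), a residual block computes F(x) and adds the input back in through the skip connection:

import torch.nn as nn

class ResidualBlock(nn.Module):
    """Minimal residual block: output = relu(F(x) + x), with an identity shortcut."""
    def __init__(self, channels):
        super().__init__()
        self.conv1 = nn.Conv2d(channels, channels, kernel_size=3, padding=1)
        self.conv2 = nn.Conv2d(channels, channels, kernel_size=3, padding=1)
        self.relu = nn.ReLU(inplace=True)

    def forward(self, x):
        shortcut = x                       # the skip connection: x bypasses both conv layers
        out = self.relu(self.conv1(x))
        out = self.conv2(out)
        return self.relu(out + shortcut)   # element-wise addition, then the final activation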

57 questions
11 votes • 2 answers

How to count the number of layers in a CNN?

The PyTorch implementation of ResNet-18 has the following structure, which appears to be 54 layers, not 18. So why is it called "18"? How many layers does it actually have? ResNet ( (conv1): Conv2d(3, 64, kernel_size=(7, 7), stride=(2, 2),…
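
For what it's worth, a hedged sketch of one way to count: the "18" conventionally counts only the weighted convolutional and fully connected layers (not batch norm, ReLU, or pooling), which can be checked against torchvision's implementation roughly like this:

import torch.nn as nn
from torchvision.models import resnet18  # assumes torchvision is available

model = resnet18()
weighted = [m for m in model.modules() if isinstance(m, (nn.Conv2d, nn.Linear))]
print(len(weighted))  # 21 for torchvision's ResNet-18: 20 convolutions + 1 fully connected layer

The name counts the initial 7x7 convolution, the 16 convolutions inside the eight basic blocks, and the final fully connected layer (18 in total), leaving out the three 1x1 projection convolutions used in the downsampling shortcuts.
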
10 votes • 1 answer

What is the idea behind using nn.Identity for residual learning?

So, I've read about half the original ResNet paper, and am trying to figure out how to make my version for tabular data. I've read a few blog posts on how it works in PyTorch, and I see heavy use of nn.Identity(). Now, the paper also frequently uses…
rocksNwaves • 5,331 • 4 • 38 • 77
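
Not the asker's code, but a sketch of the pattern usually meant here: nn.Identity() is a no-op layer, so the shortcut can always be called the same way whether it is a plain pass-through or a learned projection (layer sizes are illustrative):

import torch.nn as nn

class TabularResBlock(nn.Module):
    def __init__(self, in_features, out_features):
        super().__init__()
        self.fc1 = nn.Linear(in_features, out_features)
        self.fc2 = nn.Linear(out_features, out_features)
        self.relu = nn.ReLU()
        # nn.Identity() simply returns its input, so the forward pass below
        # never has to branch on whether a projection is needed.
        self.shortcut = (nn.Identity() if in_features == out_features
                         else nn.Linear(in_features, out_features))

    def forward(self, x):
        out = self.fc2(self.relu(self.fc1(x)))
        return self.relu(out + self.shortcut(x))
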
9 votes • 3 answers

What is "linear projection" in a convolutional neural network?

I am reading through Residual Learning, and I have a question. What is the "linear projection" mentioned in section 3.2? It probably looks pretty simple once you get it, but I could not grasp the idea... Can someone provide a simple example?
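
For context, a hedged sketch of what section 3.2 refers to: when the output of the residual branch F(x) has a different shape from x, the shortcut is multiplied by a learned matrix W_s, which in the convolutional case is implemented as a 1x1 convolution (sizes below are illustrative):

import torch.nn as nn

in_channels, out_channels, stride = 64, 128, 2
# W_s: a linear projection of the shortcut, here a 1x1 convolution, so that
# x and F(x) end up with matching channel count and spatial size.
projection = nn.Conv2d(in_channels, out_channels, kernel_size=1, stride=stride, bias=False)

def residual_output(x, f_x):
    # y = F(x, {W_i}) + W_s x   (element-wise addition after the projection)
    return f_x + projection(x)
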
8 votes • 3 answers

TensorFlow: How to set the learning rate on a log scale, and some TensorFlow questions

I am a deep learning and Tensorflow beginner and I am trying to implement the algorithm in this paper using Tensorflow. This paper uses Matconvnet+Matlab to implement it, and I am curious if Tensorflow has the equivalent functions to achieve the…
chesschi • 666 • 1 • 8 • 36
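
Addressing only the title's first part, a hedged sketch (not the paper's actual schedule): a learning rate that moves in equal steps on a log scale can be written with tf.keras' ExponentialDecay:

import tensorflow as tf

# Multiplies the learning rate by 0.1 every 10,000 steps, i.e. it steps through
# 1e-2, 1e-3, 1e-4, ... which is a uniform schedule on a log scale.
schedule = tf.keras.optimizers.schedules.ExponentialDecay(
    initial_learning_rate=1e-2,
    decay_steps=10000,
    decay_rate=0.1,
    staircase=True)

optimizer = tf.keras.optimizers.SGD(learning_rate=schedule, momentum=0.9)
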
7 votes • 4 answers

Does it make sense to build a residual network with only fully connected layers (instead of convolutional layers)?

Residual networks are always built with convolutional layers. I have never seen residual networks with only fully connected layers. Does it work to build a residual network with only fully connected layers?
7 votes • 2 answers

Residual Neural Network: Concatenation or Element Addition?

With the residual block in residual neural networks, is the addition at the end of the block true element addition or is it concatenation? For example, would addition([1, 2], [3, 4]) produce [1, 2, 3, 4] or [4, 6]?
C. R. • 77 • 1 • 2
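
For reference, the original ResNet block uses element-wise addition (concatenation is what DenseNet-style architectures do); a two-line check in PyTorch, purely for illustration:

import torch

a, b = torch.tensor([1., 2.]), torch.tensor([3., 4.])
print(a + b)              # tensor([4., 6.])          element-wise addition, as in a residual block
print(torch.cat([a, b]))  # tensor([1., 2., 3., 4.])  concatenation, a different operation
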
7 votes • 1 answer

Is it possible to have non-trainable layer in Keras?

I would like to compute a constant convolution, like blurring or resampling, and want it to never change during training. Can I initialize the convolution kernel to a constant and exclude it from training in Keras? More specifically, I don't want to use this…
Dims • 47,675 • 117 • 331 • 600
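
One common way to do this, sketched here with an illustrative 3x3 box-blur kernel (not the asker's filter): create the convolution with trainable=False and overwrite its kernel after building the layer:

import numpy as np
from tensorflow.keras import layers

# Fixed 3x3 averaging (box-blur) kernel for a single-channel input;
# shape is (kernel_h, kernel_w, in_channels, filters).
blur_kernel = np.full((3, 3, 1, 1), 1.0 / 9.0, dtype=np.float32)

blur = layers.Conv2D(filters=1, kernel_size=3, padding="same",
                     use_bias=False, trainable=False)  # excluded from gradient updates
blur.build(input_shape=(None, None, None, 1))
blur.set_weights([blur_kernel])                        # set the constant kernel once
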
5 votes • 1 answer

Intuition on Deep Residual Network

I was reading the Deep Residual Network paper, and there is a concept in it that I cannot fully understand: what does it mean to "hope the 2 weight layers fit F(x)"? Here F(x) is the result of processing x with two weight layers (+ ReLU…
Johnnylin • 507 • 2 • 7 • 26
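
For context, my reading of that passage (paraphrased, not quoted from the paper): if H(x) is the mapping the block is supposed to represent, the two weight layers are trained to fit only the residual F(x) = H(x) - x, and the identity shortcut adds x back in, roughly

y = F(x, {W_i}) + x,    F(x) = W_2 · σ(W_1 · x)    (σ = ReLU, biases omitted)
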
3 votes • 1 answer

How to add skip connection between convolutional layers in Keras

I would like to add a skip connection between residual blocks in keras. This is my current implementation, which does not work because the tensors have different shapes. The function looks like this: def build_res_blocks(net, x_in, num_res_blocks,…
atlas • 411 • 7 • 14
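
Not the asker's build_res_blocks, but a hedged sketch of the usual fix in the Keras functional API: when the block changes the channel count or spatial size, run the shortcut through a 1x1 convolution so both tensors match before Add (sizes illustrative):

from tensorflow.keras import layers

def res_block(x_in, filters, stride=1):
    x = layers.Conv2D(filters, 3, strides=stride, padding="same", activation="relu")(x_in)
    x = layers.Conv2D(filters, 3, padding="same")(x)
    shortcut = x_in
    if stride != 1 or x_in.shape[-1] != filters:
        # Project the shortcut so it has the same shape as x before the addition
        shortcut = layers.Conv2D(filters, 1, strides=stride, padding="same")(x_in)
    return layers.Activation("relu")(layers.Add()([x, shortcut]))
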
3 votes • 0 answers

How to set a batch-normalization op in inference mode without calling tf.layers.batch_normalization()?

I define a deep CNN with TensorFlow, including a batch-normalization op, i.e., my code may look like this: def network(input): ... input = tf.layers.batch_normalization(input, ...) ... Assume the network has been trained, and the…
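
This may not be exactly what the asker wants (they want to avoid another call to tf.layers.batch_normalization()), but the usual TF 1.x pattern is to make the mode a boolean placeholder when the op is first created, so the same graph can be switched to inference at run time; a hedged sketch:

import tensorflow as tf  # TF 1.x-style graph code, sketch only

# Defaults to False (inference); feed True only during training steps.
is_training = tf.placeholder_with_default(False, shape=(), name="is_training")

def network(inputs):
    # ... other layers ...
    # In inference mode the op uses its moving mean/variance instead of batch statistics.
    return tf.layers.batch_normalization(inputs, training=is_training)
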
3 votes • 1 answer

Residual learning in tensorflow

I am attempting to replicate this image from a research paper. In the image, the orange arrow indicates a shortcut using residual learning and the layer outlined in red indicates a dilated convolution. In the code below, r5 is the result of the…
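
Not the asker's r5, but a hedged sketch of the two pieces described in the excerpt, a dilated (atrous) convolution followed by a residual shortcut, in low-level TensorFlow:

import tensorflow as tf  # TF 1.x-style ops, sketch only

def dilated_residual_unit(x, kernel):
    # kernel shape: [height, width, in_channels, out_channels]; out_channels must
    # equal in_channels so the element-wise addition with x is valid.
    conv = tf.nn.atrous_conv2d(x, kernel, rate=2, padding="SAME")
    return tf.nn.relu(conv + x)  # the shortcut: add the unit's input back in
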
3 votes • 1 answer

Clarification on NN residual layer back-prop derivation

I've looked everywhere and can't find anything that explains the actual derivation of backprop for residual layers. Here's my best attempt and where I'm stuck. It is worth mentioning that the derivation that I'm hoping for is from a generic…
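
For reference, the step most write-ups skip (a standard derivation, not taken from the question): with a block y = x + F(x, W), the chain rule sends the upstream gradient through both paths, and the identity path passes it through unchanged:

∂L/∂x = (∂L/∂y) · (∂y/∂x) = (∂L/∂y) · (I + ∂F/∂x)

The identity term I is why the gradient reaching earlier layers cannot vanish even when ∂F/∂x is small.
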
3 votes • 0 answers

Accuracy gets worse the longer I train a Keras model

I'm currently using a ResNet built in Keras to do two-class classification. I am using ModelCheckpoint to save the best models based on validation accuracy. Better and better models are saved until I go through all my data points a few times.…
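
For reference, a hedged sketch of the checkpointing setup described (file name and metric name are illustrative): with save_best_only=True, later and worse epochs never overwrite the best weights seen so far:

from tensorflow.keras.callbacks import ModelCheckpoint

checkpoint = ModelCheckpoint("best_model.h5", monitor="val_accuracy",
                             save_best_only=True, mode="max")
# model.fit(..., callbacks=[checkpoint])  # only epochs that improve val_accuracy are saved
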
1 vote • 1 answer

VGG16 Custom Activation Function used in ResNet function

Here's my code: import os os.environ["TF_CPP_MIN_LOG_LEVEL"] = "2" import tensorflow as tf from tensorflow import keras from keras import layers from keras.datasets import cifar10 from sklearn.model_selection import train_test_split import numpy as…
1 vote • 0 answers

Why does the accuracy fluctuate widely after using batch normalization

I'm training a model which includes a batch normalization layer, but I noticed that the accuracy can fluctuate widely (from 55% to 31% in just one epoch), for both train accuracy and test accuracy, so I think it's not caused by overfitting. This is my…