Questions tagged [backpropagation]

Backpropagation is a method of gradient computation, often used in artificial neural networks to perform gradient descent. It led to a “renaissance” in the field of artificial neural network research.

In most cases, it requires a teacher that knows, or can calculate, the desired output for any input in the training set. The term is an abbreviation for "backward propagation of errors".

1267 questions
10
votes
1 answer

Compute gradients for each time step of tf.while_loop

Given a TensorFlow tf.while_loop, how can I calculate the gradient of x_out with respect to all weights of the network for each time step? network_input = tf.placeholder(tf.float32, [None]) steps = tf.constant(0.0) weight_0 =…
Genius
  • 569
  • 1
  • 3
  • 23
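
A minimal sketch of the idea behind this one, re-expressed in TF2 eager mode with a Python loop standing in for the question's TF1 tf.while_loop (variable names and sizes are made up): a persistent GradientTape records the whole unrolled recurrence, so the gradient of the state at every time step can be queried afterwards.

```python
import tensorflow as tf

weight = tf.Variable(0.5)

# Record the whole unrolled recurrence on one persistent tape, then ask for
# the gradient of the state at each time step w.r.t. the weight.
with tf.GradientTape(persistent=True) as tape:
    x = tf.constant(1.0)
    states = []
    for t in range(5):      # stands in for the tf.while_loop body
        x = x * weight      # one recurrence step
        states.append(x)

per_step_grads = [tape.gradient(x_t, weight) for x_t in states]
print([g.numpy() for g in per_step_grads])
del tape                    # release the persistent tape's resources
```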
10
votes
1 answer

How calculating the Hessian works for Neural Network learning

Can anyone explain to me, in an easy and less mathematical way, what a Hessian is and how it works in practice when optimizing the learning process for a neural network?
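
For a concrete picture, here is a hedged sketch in PyTorch with a toy quadratic loss and made-up numbers: the Hessian is the matrix of second derivatives of the loss w.r.t. the parameters, and second-order methods like Newton's use it to rescale the gradient step.

```python
import torch

# Toy quadratic loss L(w) = w0^2 + 3*w0*w1 + 2*w1^2 at an arbitrary point.
w = torch.tensor([1.0, -2.0], requires_grad=True)
loss = w[0]**2 + 3*w[0]*w[1] + 2*w[1]**2

# First derivatives (the gradient), kept in the graph so we can differentiate again.
grad = torch.autograd.grad(loss, w, create_graph=True)[0]

# Second derivatives: differentiate each gradient component again -> Hessian rows.
hessian = torch.stack([torch.autograd.grad(g, w, retain_graph=True)[0] for g in grad])
print(hessian)   # [[2., 3.], [3., 4.]]

# A Newton step rescales the gradient by the curvature: w_new = w - H^-1 g.
with torch.no_grad():
    w_new = w - torch.linalg.solve(hessian, grad)
```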
10
votes
1 answer

Looping through training data in Neural Networks Backpropagation Algorithm

How many times do I use a sample of training data in one training cycle? Say I have 60 training samples. I go through the 1st row, do a forward pass, and adjust weights using results from the backward pass. Using the sigmoidal function as below: Forward…
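
As an illustration of the usual convention (one training cycle, i.e. epoch, visits each sample exactly once), here is a minimal NumPy sketch with hypothetical data sizes:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

rng = np.random.default_rng(0)
X = rng.normal(size=(60, 3))            # 60 training samples, 3 features
y = (X.sum(axis=1) > 0).astype(float)   # toy targets
w = np.zeros(3)
lr = 0.1

for epoch in range(100):                # one "training cycle" = one full pass
    for x_i, y_i in zip(X, y):          # each sample is used once per epoch
        out = sigmoid(x_i @ w)          # forward pass
        grad = (out - y_i) * out * (1 - out) * x_i   # backward pass (chain rule)
        w -= lr * grad                  # immediate weight update
```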
10
votes
2 answers

Neural Network Architecture Design

I'm playing around with Neural Networks trying to understand the best practices for designing their architecture based on the kind of problem you need to solve. I generated a very simple data set composed of a single convex region as you can see…
Matteo
  • 7,924
  • 24
  • 84
  • 129
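
A hedged sketch of the textbook rule of thumb for this setup, assuming a 2-D dataset whose positive class fills a single convex region: one hidden layer suffices, since each hidden unit can carve out roughly one half-plane and the output unit combines them. Dataset and layer sizes below are made up.

```python
import numpy as np
from tensorflow import keras

X = np.random.uniform(-1, 1, size=(1000, 2)).astype("float32")
y = (np.abs(X).sum(axis=1) < 0.8).astype("float32")   # a convex (diamond) region

model = keras.Sequential([
    keras.layers.Dense(8, activation="tanh", input_shape=(2,)),  # ~one half-plane per unit
    keras.layers.Dense(1, activation="sigmoid"),                 # combines the half-planes
])
model.compile(optimizer="adam", loss="binary_crossentropy", metrics=["accuracy"])
model.fit(X, y, epochs=20, verbose=0)
```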
10
votes
2 answers

Can somebody please explain the backpropagation algorithm to me?

I've recently completed Professor Ng's Machine Learning course on Coursera, and while I loved the entire course, I never really managed to understand the backpropagation algorithm for training neural networks. My problem with understanding it is, he…
Jonathon Ashworth
  • 1,182
  • 10
  • 20
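
For readers landing here, a bare-bones sketch of the algorithm for a two-layer sigmoid network with squared error, written without any framework so each chain-rule step is visible (weights and inputs are arbitrary toy values):

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

x, y = np.array([0.5, -0.2]), 1.0
W1, W2 = np.ones((3, 2)) * 0.1, np.ones(3) * 0.1

# Forward pass: keep intermediate activations for reuse in the backward pass.
h = sigmoid(W1 @ x)          # hidden activations
out = sigmoid(W2 @ h)        # network output

# Backward pass: propagate the error derivative layer by layer (chain rule).
d_out = (out - y) * out * (1 - out)      # dL/d(output pre-activation)
dW2 = d_out * h                          # gradient for output weights
d_h = d_out * W2 * h * (1 - h)           # error pushed back to the hidden layer
dW1 = np.outer(d_h, x)                   # gradient for hidden weights

W1 -= 0.5 * dW1                          # gradient descent step
W2 -= 0.5 * dW2
```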
9
votes
3 answers

What Loss Or Reward Is Backpropagated In Policy Gradients For Reinforcement Learning?

I have made a small script in Python to solve various Gym environments with policy gradients. import gym, os import numpy as np #create environment env = gym.make('CartPole-v0') env.reset() s_size = len(env.reset()) a_size = 2 #import my neural…
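
A minimal sketch of the quantity that actually gets backpropagated in REINFORCE-style policy gradients (all tensors below are stand-ins): the negative log-probability of each chosen action, weighted by the return.

```python
import torch

logits = torch.randn(3, 2, requires_grad=True)    # stand-in for the policy net output
actions = torch.tensor([0, 1, 0])                 # actions actually taken
returns = torch.tensor([1.0, 0.5, -0.2])          # discounted returns per step

log_probs = torch.log_softmax(logits, dim=1)[torch.arange(3), actions]
loss = -(log_probs * returns).sum()   # pseudo-loss whose gradient is the policy gradient
loss.backward()                       # this is what gets backpropagated
```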
9
votes
1 answer

Pytorch Autograd gives different gradients when using .clamp instead of torch.relu

I'm still working on my understanding of the PyTorch autograd system. One thing I'm struggling with is understanding why .clamp(min=0) and nn.functional.relu() seem to have different backward passes. It's especially confusing as .clamp is used…
DaFlooo
  • 91
  • 1
  • 6
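
A small probe of the difference the question reports, checking what subgradient each op assigns at exactly x = 0, where max(0, x) has a kink (on the PyTorch versions discussed, relu reportedly picks 0 there while clamp propagates 1):

```python
import torch

for fn in (torch.relu, lambda t: t.clamp(min=0)):
    x = torch.zeros(1, requires_grad=True)
    fn(x).sum().backward()
    print(x.grad)   # the two ops may disagree at the kink (0 vs. 1 reported)
```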
9
votes
1 answer

Pytorch: How to create an update rule that doesn't come from derivatives?

I want to implement the following algorithm, taken from this book, section 13.6: I don't understand how to implement the update rule in pytorch (the rule for w is quite similar to that of theta). As far as I know, torch requires a loss for…
Gulzar
  • 23,452
  • 27
  • 113
  • 201
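
A hedged sketch of the usual workaround: compute the custom update yourself and apply it under torch.no_grad(), bypassing loss.backward() entirely. Here eligibility and delta are hypothetical stand-ins for the book's trace and TD error.

```python
import torch

w = torch.zeros(3, requires_grad=True)
eligibility = torch.ones(3)   # hypothetical eligibility trace
delta = 0.7                   # hypothetical TD error
alpha = 0.1                   # step size

with torch.no_grad():
    w += alpha * delta * eligibility   # custom rule, no derivative of a loss needed
```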
9
votes
1 answer

Neural nets for ruby

Which libraries/plugins are the best (fast/well-documented/etc.) for designing and creating neural nets with backpropagation? Googling: Ai4r, Ai-Appp…
9
votes
1 answer

How does tensorflow handle non-differentiable nodes during gradient calculation?

I understood the concept of automatic differentiation, but couldn't find any explanation of how tensorflow calculates the error gradient for non-differentiable functions such as tf.where in my loss function or tf.cond in my graph. It works just…
Natjo
  • 2,005
  • 29
  • 75
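
A minimal TF2 sketch of the usual resolution: ops like tf.where are piecewise differentiable, and autodiff routes the gradient through whichever branch was selected, elementwise (values below are arbitrary):

```python
import tensorflow as tf

x = tf.Variable([-1.0, 2.0])
with tf.GradientTape() as tape:
    y = tf.where(x > 0, x**2, 3.0*x)   # selects a branch per element
print(tape.gradient(y, x))             # [3., 4.]: d(3x)/dx and d(x^2)/dx
```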
9
votes
2 answers

Truncated Backpropagation in keras with one sequence per batch

If I understood correctly, to perform TBPTT in keras we have to split our sequences into smaller parts of k timesteps. To re-use the state of our LSTM across all the parts of the sequence we have to use the stateful parameter, according to the…
François MENTEC
  • 1,150
  • 4
  • 12
  • 25
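
A hedged sketch of the setup the question describes, using the tf.keras / Keras 2 API (shapes and sizes are made up): a stateful LSTM is fed one k-step chunk of the sequence per batch, so gradients are truncated at chunk boundaries while the hidden state carries across chunks.

```python
import numpy as np
from tensorflow import keras

k = 10                                   # truncation length (timesteps per chunk)
model = keras.Sequential([
    keras.layers.LSTM(32, stateful=True, batch_input_shape=(1, k, 4)),
    keras.layers.Dense(1),
])
model.compile(optimizer="adam", loss="mse")

seq = np.random.randn(100, 4).astype("float32")          # one long sequence
targets = np.random.randn(100 // k, 1).astype("float32")  # one target per chunk

model.reset_states()                     # start of the sequence
for i in range(100 // k):                # each chunk: forward+backward over k steps
    chunk = seq[i*k:(i+1)*k][None, ...]  # shape (1, k, 4)
    model.train_on_batch(chunk, targets[i:i+1])
```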
9
votes
1 answer

How does a convolution kernel get trained in a CNN?

In a CNN, the convolution operation 'convolves' a kernel matrix over an input matrix. Now, I know how a fully connected layer makes use of gradient descent and backpropagation to get trained. But how does the kernel matrix change over time? There…
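
A minimal PyTorch sketch of the short answer: the kernel is just another weight tensor, so backprop assigns it a gradient and gradient descent updates it exactly like a fully connected weight (sizes below are arbitrary):

```python
import torch

conv = torch.nn.Conv2d(1, 1, kernel_size=3, bias=False)
x = torch.randn(1, 1, 8, 8)
loss = conv(x).sum()
loss.backward()

print(conv.weight.grad.shape)   # torch.Size([1, 1, 3, 3]): one grad per kernel entry
with torch.no_grad():
    conv.weight -= 0.01 * conv.weight.grad   # plain SGD step on the kernel
```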
9
votes
1 answer

Is column selection in pytorch differentiable?

Is column selection in Pytorch differentiable? For example, if I want to select a single column from each row to make a new rows × 1 array and then backprop using this new array, will the backprop work properly? qvalues = qvalues[range(5),[0,1,0,1,0]] if…
patrick
  • 91
  • 1
  • 2
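
A quick check of the question's own pattern (shapes copied from the snippet): integer indexing is differentiable in PyTorch, and the backward pass scatters gradients into exactly the selected entries.

```python
import torch

qvalues = torch.randn(5, 2, requires_grad=True)
selected = qvalues[torch.arange(5), torch.tensor([0, 1, 0, 1, 0])]  # one column per row
selected.sum().backward()
print(qvalues.grad)   # ones at the picked (row, column) entries, zeros elsewhere
```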
9
votes
1 answer

Returning multiple values in the input function for `tf.py_func`

I'm trying to set custom gradients using tf.py_func and tf.RegisterGradient. Specifically, I'm trying to take the gradient of an eigenvalue w.r.t. its Laplacian. I got the basic thing working, where my python function returns one value, which is the…
alpaca
  • 1,211
  • 13
  • 23
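
A minimal sketch, assuming TF1 graph mode (tf.compat.v1 on newer releases) and a made-up 3×3 placeholder: when Tout is a list, tf.py_func returns one tensor per value the Python function returns.

```python
import numpy as np
import tensorflow as tf   # TF1-style graph mode assumed

def eig(x):
    vals, vecs = np.linalg.eigh(x)
    return vals.astype(np.float32), vecs.astype(np.float32)

laplacian = tf.placeholder(tf.float32, [3, 3])
# Tout lists two dtypes, so py_func yields two tensors: one per returned value.
eigvals, eigvecs = tf.py_func(eig, [laplacian], [tf.float32, tf.float32])
```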
9
votes
2 answers

Why scaling data is very important in neural networks (LSTM)

I am writing my master's thesis about how to apply LSTM neural networks to time series. In my experiment, I found out that scaling data can have a great impact on the result. For example, when I use a tanh activation function and the value range is…
Thanh Quang
  • 193
  • 1
  • 11
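
A small numeric illustration of the effect the question observes, assuming tanh units: unscaled inputs saturate the activation, so its derivative, and hence the backpropagated gradient, collapses toward zero.

```python
import numpy as np

def tanh_grad(z):
    return 1.0 - np.tanh(z)**2   # derivative of tanh

print(tanh_grad(np.array([0.5, 5.0, 50.0])))
# ~[0.79, 1.8e-4, 0.0]: large raw input values leave almost no gradient
```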