
I am currently trying to train a simple feed-forward neural net to solve the very simple differential equation dy/dx = 2 on [-1, 1].

If we consider the neural net to be the function NN(x), I have set my loss to be MSE(NN'(x) - 2, 0), but I am having trouble computing the derivative of NN(x) with respect to x using autograd.

My main idea is that I have partitioned the interval into 100 subintervals:

X = torch.linspace(-1,1,N) #[-1, -0.98, -0.96...0.98, 1] #where N = 101

I then feed these points -1, -0.98, ... into the neural net and use the following code to check the gradients:

for epoch in range(num_epochs):
    for j in range(N):
        input = torch.tensor([float(X[j])], requires_grad=True) 
        
        output = model(input)
        output.backward()
        loss = criterion(input.grad, torch.tensor([2.0], requires_grad=True))
        
        loss.backward()
        optimiser.zero_grad()  
        optimiser.step()

The issue is that I am going backwards through the graph twice (output.backward, loss.backward), but I can't figure out another way to compute the derivative dNN(x)/dx.

Is there another way to go about this? It's my first SO post so please be kind :)

mimo

1 Answer


It seems that you are on the right track with your approach, but there are a few things that need to be adjusted.

First, the way you define your input tensor is already correct: setting requires_grad=True on the tensor is what tells autograd to track gradients with respect to it, so this line can stay as it is:

input = torch.tensor([float(X[j])], requires_grad=True)

Next, you are calling output.backward() before computing the loss. This does populate input.grad, but that gradient is a plain tensor that is detached from the computational graph, so a loss built from it cannot backpropagate into your model's parameters (and the backward pass also frees the graph, which is why going backwards through it twice is a problem). You can remove the line output.backward().
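
To see this, note that the gradient produced by output.backward() carries no history. A quick check (assuming input and model as defined in the question):

output = model(input)
output.backward()
print(input.grad.grad_fn)        # None: the gradient is not part of any graph
print(input.grad.requires_grad)  # False: a loss built from it cannot reach the model's parameters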

Regarding the loss computation, you want the mean squared error between the derivative of your model's output, NN'(x), and the value 2. However, you are passing input.grad as the predicted derivative, and as noted above that tensor is detached from the graph, so the loss cannot train the model. Instead, compute the derivative of your model's output with torch.autograd.grad. Here's how you can modify the loss computation:

output = model(input)
predicted_derivative = torch.autograd.grad(output, input, grad_outputs=torch.ones_like(output), create_graph=True)[0]
loss = criterion(predicted_derivative, torch.tensor([2.0]))

Note that torch.autograd.grad computes the gradient of output with respect to input. For a one-element output, passing grad_outputs=torch.ones_like(output) simply seeds the backward pass with 1, so the result is the first derivative dNN(x)/dx. The create_graph=True flag is essential here: it keeps the derivative itself inside the computational graph, so that loss.backward() can propagate gradients through it into the model's parameters (and it also allows higher-order derivatives if you ever need them).
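
As a quick sanity check (a minimal sketch using a function with a known derivative rather than the model), you can verify that torch.autograd.grad returns the pointwise derivative and that create_graph=True keeps it differentiable:

import torch

x = torch.tensor([0.5], requires_grad=True)
y = x ** 3                                    # known derivative: 3x^2 = 0.75 at x = 0.5
dy_dx = torch.autograd.grad(y, x, grad_outputs=torch.ones_like(y), create_graph=True)[0]
print(dy_dx)                                  # tensor([0.7500], grad_fn=...): still in the graph
loss = (dy_dx - 2.0) ** 2
loss.backward()                               # succeeds because create_graph=True kept the graph alive
print(x.grad)                                 # d(loss)/dx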

Finally, call loss.backward() to compute the gradients with respect to the model's parameters before updating them with the optimiser, and make sure optimiser.zero_grad() is called before loss.backward() (for example at the start of each iteration); calling it between backward() and step(), as in your original loop, wipes the freshly computed gradients so the optimiser never updates anything.

Here's the modified code with the above changes:

for epoch in range(num_epochs):
    for j in range(N):
        input = torch.tensor([float(X[j])], requires_grad=True)

        optimiser.zero_grad()  # clear old parameter gradients first
        output = model(input)
        predicted_derivative = torch.autograd.grad(output, input, grad_outputs=torch.ones_like(output), create_graph=True)[0]
        loss = criterion(predicted_derivative, torch.tensor([2.0]))

        loss.backward()  # gradients w.r.t. the model's parameters
        optimiser.step()

With these modifications, you should be able to compute the gradients and train your feed-forward neural network to solve the given differential equation. Remember to adjust the learning rate, optimizer, and other hyperparameters as needed for your specific problem.
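
For completeness, here is a minimal end-to-end sketch of the training loop. The architecture, optimiser, learning rate, and epoch count are hypothetical choices, and the whole grid is fed in a single batch instead of point by point, which is usually both faster and more stable:

import torch
import torch.nn as nn

# Hypothetical small network; any differentiable architecture works here.
model = nn.Sequential(nn.Linear(1, 32), nn.Tanh(), nn.Linear(32, 1))
optimiser = torch.optim.Adam(model.parameters(), lr=1e-3)
criterion = nn.MSELoss()

N = 101
num_epochs = 2000                              # hypothetical
X = torch.linspace(-1, 1, N).unsqueeze(1)      # shape (N, 1): one feature per grid point
X.requires_grad_(True)
target = 2.0 * torch.ones_like(X)              # dy/dx should equal 2 everywhere

for epoch in range(num_epochs):
    optimiser.zero_grad()                      # clear old parameter gradients first
    output = model(X)
    dNN_dx = torch.autograd.grad(output, X, grad_outputs=torch.ones_like(output), create_graph=True)[0]
    loss = criterion(dNN_dx, target)
    loss.backward()                            # gradients w.r.t. the model's parameters
    optimiser.step()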

  • This answer looks like it was generated by an AI (like ChatGPT), not by an actual human being. You should be aware that [posting AI-generated output is officially **BANNED** on Stack Overflow](https://meta.stackoverflow.com/q/421831). If this answer was indeed generated by an AI, then I strongly suggest you delete it before you get yourself into even bigger trouble: **WE TAKE PLAGIARISM SERIOUSLY HERE.** Please read: [Why posting GPT and ChatGPT generated answers is not currently acceptable](https://stackoverflow.com/help/gpt-policy). – tchrist Jul 05 '23 at 13:39
  • It seems that there is still no convergence even with the above fix. The predicted derivative never seems to go very far from the zero vector even with loss.backward(). Surely the loss.backward() line would train the net to minimise the loss? – mimo Jul 06 '23 at 02:16
  • As it’s currently written, your answer is unclear. Please [edit] to add additional details that will help others understand how this addresses the question asked. You can find more information on how to write good answers [in the help center](/help/how-to-answer). – Community Jul 15 '23 at 21:22