I am training actor-critic networks with multi-agent DDPG (MADDPG) for 10,000 episodes, with 25 time steps per episode. After about ten episodes of training, I get the following error while computing the gradients.
RuntimeError: one of the variables needed for gradient computation has been modified by an inplace operation: [torch.cuda.FloatTensor [64, 100]], which is output 0 of AsStridedBackward0, is at version 3; expected version 2 instead. Hint: enable anomaly detection to find the operation that failed to compute its gradient, with torch.autograd.set_detect_anomaly(True).
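The hint points at torch.autograd.set_detect_anomaly. As far as I understand, it would be enabled once before the training loop (the commented-out call inside my loop below was an attempt at this), roughly like:

import torch

# Enable autograd anomaly detection once, before training starts, so the
# backward pass reports which forward operation produced the failing gradient.
torch.autograd.set_detect_anomaly(True)

With that enabled, the traceback should point at the forward operation whose output was later modified in place, but I have not been able to tell from it which line in my update step is responsible.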
My code for computing gradients and updating the model is as follows.
for agent_idx, agent in enumerate(self.agents):
    # torch.autograd.set_detect_anomaly(True)

    # Critic update: bootstrap the target value from the target critic,
    # zero it for terminal transitions, and regress the critic towards it.
    critic_value_ = agent.target_critic.forward(states_, new_actions).flatten()
    critic_value_[dones[:, 0]] = 0.0
    critic_value = agent.critic.forward(states, old_actions).flatten()

    target = rewards[:, agent_idx] + (agent.gamma * critic_value_)
    critic_loss = F.mse_loss(target, critic_value)

    agent.critic.optimizer.zero_grad()
    critic_loss.backward(retain_graph=True)
    agent.critic.optimizer.step()

    # Actor update: maximize the critic's value of the actions proposed by the
    # current policies (mu), i.e. minimize its negative mean.
    actor_loss = agent.critic.forward(states, mu).flatten()
    actor_loss = -torch.mean(actor_loss)

    agent.actor.optimizer.zero_grad()
    actor_loss.backward(retain_graph=True)
    agent.actor.optimizer.step()

    # Soft-update the target networks.
    agent.update_network_parameters()
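The only explicit in-place operation I can spot is the masking of critic_value_ with dones. I am not sure it is the culprit, since the tensor in the error has shape [64, 100], which looks more like a layer weight than the flattened critic output, but an out-of-place version of that line (assuming dones[:, 0] is a boolean tensor over the batch) would look like:

# Out-of-place alternative to `critic_value_[dones[:, 0]] = 0.0`,
# in case that assignment is the in-place operation autograd complains about.
critic_value_ = torch.where(
    dones[:, 0],
    torch.zeros_like(critic_value_),
    critic_value_,
)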
I am using PyTorch version 1.13.1+cu116. How can I solve this issue?