
I am trying to teach a fully actuated double pendulum to perform a swing-up maneuver and hold that position once it reaches it. I previously trained a single pendulum with DQN, and it learned a good policy in just a couple of hours of training. The problem now is that the robot has two joints, so two actions must be chosen at the same time: the torque for the first joint and the torque for the second. The only method I can think of is to output a Q-value for each possible pair of actions. The drawback is that the number of joint actions grows exponentially with the number of joints. Is there any other way to solve this problem?
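To make the scaling concrete, here is a minimal sketch of the joint-action enumeration I have in mind. The five torque levels and the helper names `joint_actions` and `decode_action` are assumptions chosen for illustration only; they are not taken from my actual setup.

```python
import itertools

# Hypothetical discretization: 5 torque levels per joint (illustrative values).
TORQUE_LEVELS = [-2.0, -1.0, 0.0, 1.0, 2.0]


def joint_actions(torque_levels, n_joints):
    """Enumerate every combination of per-joint torques."""
    return list(itertools.product(torque_levels, repeat=n_joints))


def decode_action(index, torque_levels, n_joints):
    """Map a flat DQN output index back to one torque per joint.

    Uses the same ordering as itertools.product: the last joint varies fastest.
    """
    k = len(torque_levels)
    torques = []
    for _ in range(n_joints):
        index, level = divmod(index, k)
        torques.append(torque_levels[level])
    return tuple(reversed(torques))


# The Q-network needs one output per joint action: k ** n_joints outputs.
for n in (1, 2, 3, 4):
    print(n, "joints ->", len(joint_actions(TORQUE_LEVELS, n)), "Q-values")
```

With 5 levels per joint, the network needs 5 outputs for one joint, 25 for two, and 625 for four, which is the growth I would like to avoid.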
