I'm currently working on a trajectory optimization problem that involves binary actuators. In order to avoid solving an MINLP I do not simply optimize over the states and control inputs but instead, assume that each of the binary actuators alternates between the states "on" and "off" and optimize over the duration of those intervals. I will call the array containing the decision variable h (an N by 2 matrix in the particular case below).
A minimal example using a double integrator that has two control inputs that enact a positive or negative force on the system respectively would be:
Here I model the state trajectory as some train of 3rd order polynomial. I particularly do not want to merge these actuators into one with the states -1,0,1 since the more general system I'd like to apply this to also has more binary actuators. I add some additional constraints such as connecting the polynomials continuously and differentiably; enforce that the sum of all intervals is equal to the desired final time; enforce initial and final state constraints and finally enforce the dynamics of the system. My initial idea was to then enforce the dynamics at constant intervals, i.e.:
However, the issue here is that any of the actuators could be in any arbitrary interval for some time t. Since the intervals can shrink to duration zero one actuator might be in the last interval while the other one remains in the first. I.e. the value of a decision variable (time duration) changes which decision variables are dependent on each other. This is more or less manifested in drake by the fact that I cannot do a comparison such as Tau < t if Tau is a drake expression and t is some number. The code snippet is:
# Enforce dynamics at the end of each control interval
for t in np.arange(0, Tf, dt_dyn):
# Find the index of the interval that is active for each actuator
t_ctrl = np.cumsum(h, axis=0)
intervals = (t_ctrl < t)
idxs = np.sum(intervals, axis=0)
# If the idx is even the actuator is off, otherwise its on
prog.AddConstraint(eq(qdd(q_a, t, dt_state),
continuous_dynamics(q(q_a, t, dt_state),
qd(q_a, t, dt_state),
[idxs[0] % 2, idxs[1] % 2])))
and the resulting error message:
Traceback (most recent call last):
File "test.py", line 92, in <module>
intervals = (t_ctrl < t)
RuntimeError: You should not call `__bool__` / `__nonzero__` on `Formula`. If you are trying to make a map with `Variable`, `Expression`, or `Polynomial` as keys (and then access the map in Python), please use pydrake.common.containers.EqualToDict`.
In the end, my question is more of a conceptual than technical nature: Does drake support this dependence of the "dependence of decision variables on the values" in some other way? Or is there a different way I can transcribe my problem to avoid the ambiguity in which interval my actuators are for some given time?
I've also linked to the overall script that implements the minimal example here