
I am designing an experiment in which a system receives 5 inputs [x1, x2, x3, x4, x5] and records target data associated with that "state". Given a new input state, the system transitions from the old state to the new one at a fixed speed per variable (these are constants) and records the target value along the path. Given a limit on the total time the system can run, what is the best subset of states to select, and the best trajectory through them, to maximize the total diversity of the data points generated?

The following are the variables we use:

x1 = np.arange(5, 37, 2)
x2 = np.arange(5, 53, 3)
x3 = np.arange(0, 3, 1)
x4 = np.arange(0, 4.3, 0.3)
x5 = np.arange(-3, 3.2, 0.2)
  • There are a total of 357120 possible state combinations
  • Total experiment time cannot exceed 20 minutes (or 1200 seconds)
  • When a new state is reached, the system remains stationary for 5 seconds before beginning to move to a new state
  • The following are the rates of change (per 1/10 of a second) for each variable:
x1_speed = 0.1
x2_speed = 0.1
x3_speed = 0.005 
x4_speed = 0.13
x5_speed = 0.1
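
As a sanity check on the numbers above, here is a minimal sketch that counts the states and computes the time for one transition. It assumes all five variables move at the same time, so the slowest one sets the transition time (this matches the `max(...)` used in the code further down). The names `grids`, `speeds`, `DWELL` and `transition_time` are just for illustration.

```python
import numpy as np

# Grids and speeds copied from the lists above; speeds are per tenth of a second.
grids = [
    np.arange(5, 37, 2),
    np.arange(5, 53, 3),
    np.arange(0, 3, 1),
    np.arange(0, 4.3, 0.3),
    np.arange(-3, 3.2, 0.2),
]
speeds = np.array([0.1, 0.1, 0.005, 0.13, 0.1])
DWELL = 50  # 5 seconds of standing still, in tenths of a second


def transition_time(old, new):
    """Tenths of a second to move from old to new.

    Assumption: all variables move simultaneously, so the slowest one decides.
    """
    delta = np.asarray(new, dtype=float) - np.asarray(old, dtype=float)
    return float(np.max(np.abs(delta) / speeds))


# Number of distinct states is the product of the grid sizes.
n_states = int(np.prod([len(g) for g in grids]))

# Longest possible step: from the lowest corner of the grid to the highest.
corner_lo = [g[0] for g in grids]
corner_hi = [g[-1] for g in grids]
worst_step = transition_time(corner_lo, corner_hi) + DWELL

print(n_states)
print(worst_step)
```

With these grids the slowest variable on the corner-to-corner move is x2 (45 units at 0.1 per tenth of a second), so a single step including the pause costs at most about 500 tenths of a second out of the 12000 available.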

The best strategy I have come up with so far is to randomly initialize a subset of states within the total time allowed, calculate the total Euclidean distance between all points in that subset, and keep the subset that maximizes that distance as a measure of the diversity of the data points. However, inspecting some pairwise interactions in the 2D plane shows that chunks of the space are still left unmapped (per the picture below).

Result of random state generation & optimization for Euclidean distance
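
For reference, the diversity score described above can be sketched as below. One thing that may be worth checking: x1 and x2 span ranges of 30 and 45 while x3 spans only 2, so an unscaled Euclidean distance is dominated by the first two variables. The min-max scaling here is an assumption and not part of my original approach, and the `diversity` helper, `LOW` and `HIGH` are hypothetical names.

```python
import numpy as np
from scipy.spatial.distance import pdist

# Lower and upper bounds of x1..x5, read off the grids defined above.
LOW = np.array([5.0, 5.0, 0.0, 0.0, -3.0])
HIGH = np.array([35.0, 50.0, 2.0, 4.2, 3.0])


def diversity(states, scale=True):
    """Sum of pairwise Euclidean distances between the given states."""
    pts = np.asarray(states, dtype=float)
    if scale:
        # Assumption: map every variable to [0, 1] so each one counts equally.
        pts = (pts - LOW) / (HIGH - LOW)
    return float(pdist(pts, metric='euclidean').sum())


# Two states spanning the full range of x3, and two spanning the full range of x2.
only_x3 = [[5, 5, 0, 0, 0], [5, 5, 2, 0, 0]]
only_x2 = [[5, 5, 0, 0, 0], [5, 50, 0, 0, 0]]

print(diversity(only_x3, scale=False), diversity(only_x2, scale=False))
print(diversity(only_x3), diversity(only_x2))
```

Unscaled, the full-range move in x2 scores 45 while the full-range move in x3 scores only 2; after scaling both score 1. Whether equal weighting is what the experiment needs is a separate question.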

I've looked into path optimization problems, but given the number of possible states and transitions, simulating all possible edges between all possible nodes seemed like the wrong approach. The fact that the points selected for the experiment have to be optimized, rather than being pre-defined, complicates things as well. If this is a diversity maximization LP problem with some constraints, how do you make it recursive so that it maps out the whole path? Alternatively, I have been reading about Reinforcement Learning approaches.
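
To put a number on why enumerating every edge looks impractical, the arithmetic below counts the transitions between all pairs of the 357120 states. It treats edges as undirected, since the transition time depends only on the absolute difference between two states. The 8 bytes per edge is an assumption for illustration (one float64 transition time per edge).

```python
from math import comb

n_states = 357120             # state count stated in the list above
n_edges = comb(n_states, 2)   # one undirected edge per pair of distinct states

# Assumption: one 8-byte float (the transition time) stored per edge.
bytes_needed = n_edges * 8
gib_needed = bytes_needed / 2**30

print(n_edges)  # 63767168640
print(round(gib_needed, 1))
```

That is roughly 64 billion edges, or on the order of 475 GiB just to hold the transition times, before any path search starts.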

The code I used to accomplish the above is as follows. I skipped the part where I record the states, but there are extra lines to populate states_df, which I then use to calculate the total Euclidean distance. I can add them if necessary, but please let me know if there is a better strategy/algorithm to solve this problem given these constraints. Thanks!


import numpy as np
import pandas as pd
from scipy.spatial.distance import pdist

def state_gen():

    var1 = np.random.choice(x1)
    var2 = np.random.choice(x2)
    var3 = round(np.random.choice(x3), 1)
    var4 = round(np.random.choice(x4), 1)
    var5 = round(np.random.choice(x5), 1)

    state = [var1, var2, var3, var4, var5]
    return state


def state_change(state_old):

    state_new = state_gen()

    #x1
    x1_travel = state_old[0] - state_new[0]
    x1_time = abs(x1_travel/x1_speed) #in tenths of a second

    #x2
    x2_travel = state_old[1] - state_new[1]
    x2_time = abs(x2_travel/x2_speed) #in tenths of a second

    #x3
    x3_travel = state_old[2] - state_new[2]
    x3_time = abs(x3_travel/x3_speed) #in tenths of a second

    #x4
    x4_travel = state_old[3] - state_new[3]
    x4_time = abs(x4_travel/x4_speed) #in tenths of a second

    #x5
    x5_travel = state_old[4] - state_new[4]
    x5_time = abs(x5_travel/x5_speed) #in tenths of a second

    #Record values to **states_df** here, add to return statement

    t_change = max(x1_time, x2_time, x3_time, x4_time, x5_time)
    return state_new, t_change


def objective():

    #Initialize first state
    state_old = state_gen()

    t_tot = 0

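    #Times are in tenths of a second. The longest possible step is 500
    #(450 of travel + 50 stationary), so stopping at 11500 keeps the total under 12000 (20 minutes)
    while t_tot < 11500:

        state_new, t_change = state_change(state_old)
        
        #System stationary for 5 seconds
        t_change += 50
        t_tot += t_change

        #The next transition starts from the state just reached
        state_old = state_new
    
    #states_df is populated by the recording code omitted from state_change
    tot_euc = pdist(states_df.values, metric='euclidean').sum()
    return tot_euc, t_tot, states_df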



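iter_nums = 250000
tot_euc_max = 0
best_time = None
best_states = None

for i in range(iter_nums):
    tot_euc, tot_time, states_df = objective()
    if tot_euc > tot_euc_max:
        #Keep the run time and states of the best run, not just its score
        tot_euc_max = tot_euc
        best_time = tot_time
        best_states = states_df

print('Maximum {} attained'.format(tot_euc_max))
print('Total run time: {}'.format(best_time))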

Ram
tishafok
  • Do the input variables depend on each other? Does the new state of the system depend on the previous one? A real system tends to spend more of its time in states confined to some region; do you have such statistics? – Сергей Кох Dec 19 '22 at 20:03
  • Thanks Sergei, the system state is bounded only by the ranges of the variable values, and it can transition from any one state to any other (I guess the only constraint here is that exact states should not repeat). The goal of this exercise is to cover the entire 5D space as evenly as possible so that the target data associated with each point can be collected; otherwise that "map" will be missing some states. The variables are not independent in reality, but the entire space needs to be mapped out. – tishafok Dec 20 '22 at 03:21

0 Answers