
I'm running the code below in Colab, in two separate notebooks (separate files). In one notebook the code works; in the other it does not. In the failing notebook, the function np.cumsum() appears to return an array twice as long as its input, which raises a ValueError: "operands could not be broadcast together with shapes (2000,) (1000,)".
I can't figure out why this is happening, or how it's even possible. I also can't find any answers online (or even a similar report of the same problem), so any help would be greatly appreciated!

'''

def b2_run_advanced_strategies_experiment(env_name='BanditTwoArmedUniform-v0'):
    results = {}
    experiments = [
        # baseline strategies 
        lambda env: pure_exploitation(env, N_Episodes),
        lambda env: pure_exploration(env, N_Episodes),

    ]
    for env_seed in tqdm(SEEDS, desc='All experiments'):
        env = gym.make(env_name, seed=env_seed) ; env.reset()
        true_Q = np.array(env.env.p_dist * env.env.r_dist)
        opt_V = np.max(true_Q)
        for seed in tqdm(SEEDS, desc='All environments', leave=False):
            for experiment in tqdm(experiments, 
                                   desc='Experiments with seed {}'.format(seed), 
                                   leave=False):
                env.seed(seed) ; np.random.seed(seed) ; random.seed(seed)
                name, Re, Qe, Ae = experiment(env)
                Ae = np.expand_dims(Ae, -1)                
                print("len of Re=",len(Re)) # RESULT GIVES 1000
                print("len of cumsum= ",len(np.cumsum(Re))) # RESULT GIVES 2000, HOW IS THAT POSSIBLE???
                
                episode_mean_rew = np.cumsum(Re) / (np.arange(len(Re)) + 1) # ERROR ON THIS LINE

                Q_selected = np.take_along_axis(
                    np.tile(true_Q, Ae.shape), Ae, axis=1).squeeze()
                regret = opt_V - Q_selected
                cum_regret = np.cumsum(regret)
                if name not in results.keys(): results[name] = {}
                if 'Re' not in results[name].keys(): results[name]['Re'] = []
                if 'Qe' not in results[name].keys(): results[name]['Qe'] = []
                if 'Ae' not in results[name].keys(): results[name]['Ae'] = []
                if 'cum_regret' not in results[name].keys(): 
                    results[name]['cum_regret'] = []
                if 'episode_mean_rew' not in results[name].keys(): 
                    results[name]['episode_mean_rew'] = []

                results[name]['Re'].append(Re)
                results[name]['Qe'].append(Qe)
                results[name]['Ae'].append(Ae)
                results[name]['cum_regret'].append(cum_regret)
                results[name]['episode_mean_rew'].append(episode_mean_rew)
    return results

b2_results_a = b2_run_advanced_strategies_experiment()

'''

1 Answer


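Your array is likely multi-dimensional. len() only reports the size of the first axis, whereas np.cumsum() without an axis argument flattens its input before accumulating, so the result has as many elements as the whole array: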

import numpy as np

eye3 = np.eye(3)
print(len(eye3)) # 3 (3 rows)
print(len(np.cumsum(eye3))) # 9 (3 rows * 3 columns = 9 elements once flattened)
print(len(np.cumsum(eye3, axis=1))) # 3 (3 rows)
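
Applied to the question, the reported shapes (2000,) and (1000,) would be consistent with Re having shape (1000, 2) in the failing notebook, though that is only a guess; printing np.shape(Re) instead of len(Re) would confirm it. Below is a minimal sketch under that assumption. The array Re_2d and the helper running_mean are hypothetical names for illustration, not part of the original code, and whether a per-column mean is what you actually want depends on what your strategy functions return.

```python
import numpy as np

# Hypothetical stand-in for Re in the failing notebook: 1000 episodes,
# 2 values per episode (an assumption, not taken from the question).
Re_2d = np.ones((1000, 2))

print(len(Re_2d))               # len() counts only the first axis
print(np.shape(Re_2d))          # the full shape reveals the extra axis
print(np.cumsum(Re_2d).shape)   # axis=None flattens before accumulating


def running_mean(rewards):
    """Running mean along the first (episode) axis for 1-D or N-D input."""
    rewards = np.asarray(rewards, dtype=float)
    steps = np.arange(rewards.shape[0]) + 1
    # Reshape steps to (n, 1, ..., 1) so it broadcasts against any trailing axes.
    steps = steps.reshape((-1,) + (1,) * (rewards.ndim - 1))
    return np.cumsum(rewards, axis=0) / steps


print(running_mean(Re_2d).shape)      # same shape as the input
print(running_mean([1.0, 2.0, 3.0]))  # 1-D input still works
```

If Re is supposed to be one-dimensional, the better fix is probably upstream: find out why the strategy function returns an extra axis in one notebook and not in the other (for example, a different library version or a differently shaped reward).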
Julien