Obtaining different set of configs across multiple calls in ray tune

Question

I am trying to make my code reproducible. I have already added np.random.seed(...) and random.seed(...), and at the moment I am not using PyTorch or TensorFlow, so no scheduler or searcher should be introducing any randomness. The set of configs produced by the code below should always be the same across multiple calls. However, that is not the case.

Can anyone help with this?

Thank you!

Here is the code:

import ray
from ray import tune
import random
import numpy as np

def training_function(config, data_init):
    print('CONFIG:', config)
    tune.report(end_of_training=1, acc=0, f=0)

if __name__ == '__main__':
    ray.init(num_cpus=12)
    tune_config = {'sentence_classification': False, 
              'norm_word_emb': tune.choice(['True', 'False']), 
              'use_crf': tune.choice(['True', 'False']), 
              'use_char': tune.choice(['True', 'False']), 
              'word_seq_feature': tune.choice(['CNN', 'LSTM', 'GRU']), 
              'char_seq_feature': tune.choice(['CNN', 'LSTM', 'GRU']), 
              'seed_num': 1267}
    data = {'a': 1}
    tune_seed = tune_config['seed_num']
    random.seed(tune_seed)
    np.random.seed(tune_seed)
    n_samples = 15
    exp_name = 'experiment_name'
    analysis = tune.run(
        tune.with_parameters(training_function, data_init={'data': data}),
        name=exp_name,
        metric="f",
        mode="max",
        queue_trials=True,
        config=tune_config,
        num_samples=n_samples,
        resources_per_trial={"cpu": 1},
        checkpoint_at_end=True,
        max_failures=0,
    )

ptyshevs · Answer 1 · 2021-01-08T07:45:52.520

The function-level API cannot be made reproducible (as of Ray v1.1.0; this may change).

Wait, but why

tune.run creates an Experiment object and passes your function to it.
Experiment registers the function as a trainable by calling register_trainable.
register_trainable wraps your function using wrap_function.
wrap_function creates a class-level API object (a Ray Actor) by inheriting from the FunctionRunner class.
FunctionRunner doesn't give you any callback access to its setup method.

Oversimplifying, an Actor gets distributed among the workers and is then initialized in a different process through its setup method. This is why it is crucial to pass the seed and implement the initialization logic inside your custom Trainable, as described in this answer. Seeding is needed because tune.choice is just a wrapper around random/np.random functions. You can observe this in tune/sample.py.
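To illustrate the seeding point in isolation, here is a minimal sketch that uses only the standard library. sample_choice_configs is a hypothetical helper, not part of Ray; it mimics a categorical sampler that draws from the global random module, which is an assumption based on the description of tune.choice above.

```python
import random


def sample_choice_configs(space, num_samples):
    """Draw num_samples configs, picking each value with the global random module.

    Hypothetical stand-in for a tune.choice-style sampler; not Ray code.
    """
    configs = []
    for _ in range(num_samples):
        configs.append({key: random.choice(options) for key, options in space.items()})
    return configs


space = {
    'use_crf': ['True', 'False'],
    'word_seq_feature': ['CNN', 'LSTM', 'GRU'],
}

# Same seed, same process, no other draws in between: the two lists match.
random.seed(1267)
first = sample_choice_configs(space, 15)
random.seed(1267)
second = sample_choice_configs(space, 15)
print(first == second)
```

Note that random.seed only affects the process in which it is called. A worker started as a separate process has its own generator state, which is why the seed has to be set again inside setup.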

See the example:


import ray
from ray import tune
import random
import numpy as np

class Tunable(tune.Trainable):
    def setup(self, config):
        self.config = config
        self.seed = config['seed_num']
        random.seed(self.seed)
        np.random.seed(self.seed)
    
    def step(self):
        print('CONFIG:', self.config)
        return {tune.result.DONE: True, 'acc': 0, 'f': 0}

if __name__ == '__main__':
    ray.init(num_cpus=12)
    tune_config = {'sentence_classification': False, 
              'norm_word_emb': tune.choice(['True', 'False']), 
              'use_crf': tune.choice(['True', 'False']), 
              'use_char': tune.choice(['True', 'False']), 
              'word_seq_feature': tune.choice(['CNN', 'LSTM', 'GRU']), 
              'char_seq_feature': tune.choice(['CNN', 'LSTM', 'GRU']), 
              'seed_num': 1267}
    data = {'a': 1}
    tune_seed = tune_config['seed_num']
    n_samples = 15
    exp_name = 'experiment_name'
    analysis = tune.run(
        Tunable,
        name=exp_name,
        metric="f",
        mode="max",
        queue_trials=True,
        config=tune_config,
        num_samples=n_samples,
        resources_per_trial={"cpu": 1},
        checkpoint_at_end=False,
        max_failures=0,
    )

Hi @ptyshevs, unfortunately, this does not solve my problem either. Notice that I am not trying to control the randomness during training, but during the generation of the configs. Two runs of this code generate different configurations (not just a different order of the configs, but a different set of values). — Roxana, Jan 08 '21 at 14:03
It seems like a bug in Ray and a ticket has been raised: https://github.com/ray-project/ray/issues/13295 — Roxana, Jan 08 '21 at 14:51
I wonder if this is related only to categorical variable (tune.choice), since other things are using numpy.random module — ptyshevs, Jan 08 '21 at 16:32

score -1 · Answer 2 · answered Jan 07 '21 at 18:50

I'm seeing the behavior where the seeding works. I ran this script:

import ray
from ray import tune
import numpy as np
import random


def training_function(config, data_init):
    print('CONFIG:', config)
    tune.report(end_of_training=1, acc=0, f=0)

if __name__ == '__main__':
    # ray.init(num_cpus=12)
    tune_config = {'sentence_classification': False, 
              'norm_word_emb': tune.choice(['True', 'False']), 
              'use_crf': tune.choice(['True', 'False']), 
              'use_char': tune.choice(['True', 'False']), 
              'word_seq_feature': tune.choice(['CNN', 'LSTM', 'GRU']), 
              'char_seq_feature': tune.choice(['CNN', 'LSTM', 'GRU']), 
              'seed': 1267}
    data = {'a': 1}
    tune_seed = tune_config['seed']
    random.seed(tune_seed)
    np.random.seed(tune_seed)
    n_samples = 15
    analysis = tune.run(
        tune.with_parameters(training_function, data_init={'data': data}),
        #name=exp_name,
        metric="f",
        mode="max",
        queue_trials=True,
        config=tune_config,
        num_samples=n_samples,
        resources_per_trial={"cpu": 1},
        verbose=2,
        max_failures=0,
    )

Here is the output of one run:

Resources requested: 0/16 CPUs, 0/0 GPUs, 0.0/27.0 GiB heap, 0.0/9.28 GiB objects
Current best trial: 84b84_00014 with f=0 and parameters={'sentence_classification': False, 'norm_word_emb': 'False', 'use_crf': 'True', 'use_char': 'False', 'word_seq_feature': 'LSTM', 'char_seq_feature': 'GRU', 'seed': 1267}
Number of trials: 15/15 (15 TERMINATED)
+--------------------+------------+-------+--------------------+-----------------+------------+-----------+--------------------+--------+------------------+-------------------+-------+-----+
| Trial name         | status     | loc   | char_seq_feature   | norm_word_emb   | use_char   | use_crf   | word_seq_feature   |   iter |   total time (s) |   end_of_training |   acc |   f |
|--------------------+------------+-------+--------------------+-----------------+------------+-----------+--------------------+--------+------------------+-------------------+-------+-----|
| _inner_84b84_00000 | TERMINATED |       | LSTM               | True            | False      | False     | LSTM               |      1 |       0.00149202 |                 1 |     0 |   0 |
| _inner_84b84_00001 | TERMINATED |       | CNN                | False           | True       | False     | CNN                |      1 |       0.0014801  |                 1 |     0 |   0 |
| _inner_84b84_00002 | TERMINATED |       | GRU                | False           | False      | True      | GRU                |      1 |       0.00152397 |                 1 |     0 |   0 |
| _inner_84b84_00003 | TERMINATED |       | GRU                | False           | False      | False     | GRU                |      1 |       0.00165081 |                 1 |     0 |   0 |
| _inner_84b84_00004 | TERMINATED |       | CNN                | False           | False      | False     | CNN                |      1 |       0.00173998 |                 1 |     0 |   0 |
| _inner_84b84_00005 | TERMINATED |       | LSTM               | True            | True       | True      | CNN                |      1 |       0.00219083 |                 1 |     0 |   0 |
| _inner_84b84_00006 | TERMINATED |       | GRU                | True            | False      | False     | LSTM               |      1 |       0.00192428 |                 1 |     0 |   0 |
| _inner_84b84_00007 | TERMINATED |       | LSTM               | True            | False      | False     | CNN                |      1 |       0.00208902 |                 1 |     0 |   0 |
| _inner_84b84_00008 | TERMINATED |       | LSTM               | True            | True       | True      | GRU                |      1 |       0.00146484 |                 1 |     0 |   0 |
| _inner_84b84_00009 | TERMINATED |       | CNN                | False           | False      | True      | CNN                |      1 |       0.00152087 |                 1 |     0 |   0 |
| _inner_84b84_00010 | TERMINATED |       | LSTM               | False           | True       | False     | CNN                |      1 |       0.00124121 |                 1 |     0 |   0 |
| _inner_84b84_00011 | TERMINATED |       | LSTM               | True            | True       | True      | CNN                |      1 |       0.00124812 |                 1 |     0 |   0 |
| _inner_84b84_00012 | TERMINATED |       | LSTM               | True            | True       | True      | LSTM               |      1 |       0.00133514 |                 1 |     0 |   0 |
| _inner_84b84_00013 | TERMINATED |       | LSTM               | True            | False      | True      | CNN                |      1 |       0.00142407 |                 1 |     0 |   0 |
| _inner_84b84_00014 | TERMINATED |       | GRU                | False           | False      | True      | LSTM               |      1 |       0.00120211 |                 1 |     0 |   0 |
+--------------------+------------+-------+--------------------+-----------------+------------+-----------+--------------------+--------+------------------+-------------------+-------+-----+

and the subsequent run:

Current best trial: 84b84_00014 with f=0 and parameters={'sentence_classification': False, 'norm_word_emb': 'False', 'use_crf': 'True', 'use_char': 'False', 'word_seq_feature': 'LSTM', 'char_seq_feature': 'GRU', 'seed': 1267}
Result logdir: /Users/rliaw/ray_results/_inner_2021-01-07_10-45-31
Number of trials: 15/15 (15 TERMINATED)
+--------------------+------------+-------+--------------------+-----------------+------------+-----------+--------------------+--------+------------------+-------------------+-------+-----+
| Trial name         | status     | loc   | char_seq_feature   | norm_word_emb   | use_char   | use_crf   | word_seq_feature   |   iter |   total time (s) |   end_of_training |   acc |   f |
|--------------------+------------+-------+--------------------+-----------------+------------+-----------+--------------------+--------+------------------+-------------------+-------+-----|
| _inner_84b84_00000 | TERMINATED |       | LSTM               | True            | False      | False     | LSTM               |      1 |       0.00149202 |                 1 |     0 |   0 |
| _inner_84b84_00001 | TERMINATED |       | CNN                | False           | True       | False     | CNN                |      1 |       0.0014801  |                 1 |     0 |   0 |
| _inner_84b84_00002 | TERMINATED |       | GRU                | False           | False      | True      | GRU                |      1 |       0.00152397 |                 1 |     0 |   0 |
| _inner_84b84_00003 | TERMINATED |       | GRU                | False           | False      | False     | GRU                |      1 |       0.00165081 |                 1 |     0 |   0 |
| _inner_84b84_00004 | TERMINATED |       | CNN                | False           | False      | False     | CNN                |      1 |       0.00173998 |                 1 |     0 |   0 |
| _inner_84b84_00005 | TERMINATED |       | LSTM               | True            | True       | True      | CNN                |      1 |       0.00219083 |                 1 |     0 |   0 |
| _inner_84b84_00006 | TERMINATED |       | GRU                | True            | False      | False     | LSTM               |      1 |       0.00192428 |                 1 |     0 |   0 |
| _inner_84b84_00007 | TERMINATED |       | LSTM               | True            | False      | False     | CNN                |      1 |       0.00208902 |                 1 |     0 |   0 |
| _inner_84b84_00008 | TERMINATED |       | LSTM               | True            | True       | True      | GRU                |      1 |       0.00146484 |                 1 |     0 |   0 |
| _inner_84b84_00009 | TERMINATED |       | CNN                | False           | False      | True      | CNN                |      1 |       0.00152087 |                 1 |     0 |   0 |
| _inner_84b84_00010 | TERMINATED |       | LSTM               | False           | True       | False     | CNN                |      1 |       0.00124121 |                 1 |     0 |   0 |
| _inner_84b84_00011 | TERMINATED |       | LSTM               | True            | True       | True      | CNN                |      1 |       0.00124812 |                 1 |     0 |   0 |
| _inner_84b84_00012 | TERMINATED |       | LSTM               | True            | True       | True      | LSTM               |      1 |       0.00133514 |                 1 |     0 |   0 |
| _inner_84b84_00013 | TERMINATED |       | LSTM               | True            | False      | True      | CNN                |      1 |       0.00142407 |                 1 |     0 |   0 |
| _inner_84b84_00014 | TERMINATED |       | GRU                | False           | False      | True      | LSTM               |      1 |       0.00120211 |                 1 |     0 |   0 |
+--------------------+------------+-------+--------------------+-----------------+------------+-----------+--------------------+--------+------------------+-------------------+-------+-----+

Notice that the trials and their configs are exactly the same (in the same order).

If you execute with fewer than 15 CPUs, or if you stress the system with other processes (so that the tasks are queued, etc.), and you execute the script multiple times, then you can see that the set of configs is not consistent. I have verified that it is not just the order that changes. — Roxana, Jan 07 '21 at 20:55
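
One possible workaround for the behaviour described in this comment, which nobody in the thread proposed or tested: sample the configs yourself, up front, with a private generator, so that nothing else drawing from the global random state can shift the sequence. presample_configs is a hypothetical helper and the search space below is a shortened copy of the one in the question.

```python
import random


def presample_configs(space, num_samples, seed):
    """Sample configs with a private RNG so global random state cannot interfere.

    Hypothetical helper: list values are treated as categorical choices,
    anything else is copied through as a constant.
    """
    rng = random.Random(seed)  # independent of random.seed() and of other callers
    configs = []
    for _ in range(num_samples):
        cfg = {}
        for key, value in space.items():
            cfg[key] = rng.choice(value) if isinstance(value, list) else value
        configs.append(cfg)
    return configs


space = {
    'sentence_classification': False,
    'use_crf': ['True', 'False'],
    'word_seq_feature': ['CNN', 'LSTM', 'GRU'],
    'seed_num': 1267,
}

run_a = presample_configs(space, 15, seed=1267)
random.random()  # an unrelated draw from the global generator in between
run_b = presample_configs(space, 15, seed=1267)
print(run_a == run_b)
```

The resulting list is fixed before any trial is scheduled. It could then be handed to Tune, for example by wrapping it in tune.grid_search under a single config key, but that wiring is an assumption and has not been checked against Ray v1.1.0.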
