This is my first time trying to use Ray.Tune for hyperparameter optimization. I am confused as to where in the Ray code I should initialize the dataset as well as where to put the for-loops for defining the epoch and enumerating the dataset batches.
Background
In my normal training script, I follow several steps:
1. Parse the model options,
2. Initialize the dataset,
3. Create and initialize the model,
4. For-loop for progressing through the epochs,
5. Nested for-loop for enumerating the dataset batches.
The Ray.Tune documentation says that when defining the Trainable class, I really only need _setup, _train, _save, and _restore. As I understand it, _train() runs a single iteration and increments training_iteration automatically. Since dataset_size may not be cleanly divisible by batchSize, I calculate total_steps as training progresses. If I understand correctly, my total_steps will therefore not equal training_iteration. This matters because the number of steps is supposed to determine when the worker is evaluated.
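To make the mismatch concrete, here is a small self-contained sketch (plain Python, no Ray; the dataset_size and batch_size values are made up) showing how a per-call counter like training_iteration diverges from a sample-based step counter when the last batch of each epoch is partial:

```python
import math

dataset_size = 10  # hypothetical number of samples
batch_size = 3     # hypothetical batch size; 10 is not divisible by 3

batches_per_epoch = math.ceil(dataset_size / batch_size)  # last batch is partial

training_iteration = 0  # what Ray would increment once per _train() call
total_steps = 0         # samples seen, advanced by batch_size

for epoch in range(2):
    for i in range(batches_per_epoch):
        if i < batches_per_epoch - 1:
            total_steps += batch_size                 # full batch
        else:
            total_steps = dataset_size * (epoch + 1)  # partial last batch: snap to exact count
    training_iteration += 1  # assuming _train() runs one epoch per call

print(training_iteration)  # 2
print(total_steps)         # 20
```

After two epochs the two counters read 2 and 20, so any evaluation schedule keyed to one cannot simply be reused for the other.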
I also do not want to instantiate the dataset for each worker individually. Ray should instantiate the dataset once, and then the workers can access the data via shared memory.
Original train.py code
self.opt = TrainOptions().parse()
data_loader = CreateDataLoader(self.opt)
self.dataset = data_loader.load_data()
self.dataset_size = len(data_loader)
total_steps = 0
counter = 0
for epoch in range(self.opt.starting_epoch, self.opt.niter + self.opt.niter_decay + 1):
    for i, data in enumerate(self.dataset):
        # Add a full batch per step; on the final (possibly partial) batch,
        # snap total_steps to the exact sample count for this epoch.
        # Note: enumerate yields i in 0..len-1, so the comparison must be
        # against len(self.dataset) - 1, or the else branch never runs.
        if i < len(self.dataset) - 1:
            total_steps += self.opt.batchSize
        else:
            total_steps = self.dataset_size * (epoch + 1)
        counter += 1
        self.model.set_input(data, self.opt)
        self.model.optimizeD()
        # Update the generator only every critic_iters discriminator updates
        if counter % self.opt.critic_iters == 0:
            self.model.optimizeG()