Questions tagged [ray-tune]
72 questions
0 votes · 1 answer
Unable to install ray[tune] tune-sklearn
I'm trying to install ray[tune] tune-sklearn on my machine but it keeps failing. I'm using a MacBook Pro 2019 with Big Sur Version 11.6 and Python 3.9.7 (default, Sep 16 2021, 08:50:36) [Clang 10.0.0 ] :: Anaconda, Inc. on darwin. All other packages…

user1857403 (289)
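A common culprit on macOS: zsh, the default shell since Catalina, expands square brackets as glob patterns, so the extras spec must be quoted. A minimal install sketch under that assumption:

# quote the extras so zsh does not glob the brackets
pip install -U "ray[tune]"
pip install -U tune-sklearn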
0 votes · 1 answer
Python Ray Tune unable to stop trial or experiment
I am trying to make Ray Tune with wandb stop the experiment under certain conditions:
stop the whole experiment if any trial raises an Exception (so I can fix the code and resume)
stop if my score reaches -999
stop if the variable varcannotbezero gets…

user670186 (2,588)
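Tune has two mechanisms that map onto these conditions: a custom Stopper subclass for metric-based stopping, and fail_fast=True to abort the whole run on the first trial error. A minimal sketch, where my_trainable and the score metric name are placeholders:

from ray import tune
from ray.tune.stopper import Stopper

class ScoreStopper(Stopper):
    def __call__(self, trial_id, result):
        # per-trial check, called with every reported result dict
        return result.get("score", 0) <= -999

    def stop_all(self):
        # experiment-wide check; return True to stop every trial
        return False

tune.run(
    my_trainable,
    stop=ScoreStopper(),  # metric-based stopping
    fail_fast=True,       # abort the run on the first trial Exception
)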
0 votes · 1 answer
Use GPU OR CPU in Ray Tune
I have 1 GPU and 32 CPUs available on my machine. Is it possible in Ray to use them separately? For instance, one task gets allocated 1 CPU and another task 1 GPU?
If I use
tune.run(trainer_fn,
         num_samples=32,
         …

Douglas C Vasconcelos (589)
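Yes: resource requests are made per trial (or per task), so CPU-only and GPU-only work can run side by side. A minimal sketch, with cpu_fn and gpu_fn as hypothetical trainables:

import ray
from ray import tune

# each trial of this run gets 1 CPU and no GPU ...
tune.run(cpu_fn, num_samples=32, resources_per_trial={"cpu": 1, "gpu": 0})
# ... while each trial of this one reserves the GPU
tune.run(gpu_fn, num_samples=4, resources_per_trial={"cpu": 1, "gpu": 1})

# plain Ray tasks can be split the same way:
@ray.remote(num_cpus=1)
def cpu_task():
    ...

@ray.remote(num_gpus=1)
def gpu_task():
    ...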
0 votes · 1 answer
How to restore a ray-tune checkpoint when it is integrated with PyTorch Lightning?
I have a Ray Tune analysis object and I am able to get the best checkpoint from it:
analysis = tune_robert_asha(num_samples=2)
best_ckpt = analysis.best_checkpoint
But I am unable to restore my PyTorch Lightning model with it.
I…

Luca Guarro (1,085)
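One detail that trips this up: best_checkpoint is a checkpoint directory (at least in the Ray versions of this era; newer releases return a Checkpoint object), while Lightning's load_from_checkpoint wants the file inside it. A sketch assuming the trial saved via TuneReportCheckpointCallback with its default filename="checkpoint" and a hypothetical LightningModule subclass MyModule:

import os

best_ckpt = analysis.best_checkpoint                # checkpoint directory
ckpt_file = os.path.join(best_ckpt, "checkpoint")   # file written by the callback
model = MyModule.load_from_checkpoint(ckpt_file)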
0 votes · 1 answer
Ray Tune error when using Trainable class with tune.with_parameters
Using a very simple example from the Tune documentation itself:
from ray import tune
import numpy as np

class MyTrainable(tune.Trainable):
    def setup(self, config, dataset=None):
        print(config, dataset)
        self.dataset = dataset
    …

Snufkin (25)
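For reference, the documented pattern wraps the class with tune.with_parameters so the dataset is injected into setup(); support for class Trainables was buggy in some older Ray releases, so the installed version matters. A minimal runnable sketch:

from ray import tune
import numpy as np

class MyTrainable(tune.Trainable):
    def setup(self, config, dataset=None):
        self.dataset = dataset

    def step(self):
        # report something derived from the injected dataset
        return {"mean": float(np.mean(self.dataset))}

dataset = np.random.rand(1000)
tune.run(
    tune.with_parameters(MyTrainable, dataset=dataset),
    stop={"training_iteration": 2},
)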
0 votes · 1 answer
Is there an `initial_workers` (cluster.yaml) replacement mechanism in ray tune?
I'll briefly describe my use case: assume I want to spin up a cluster with 10 workers on AWS.
In the past I always used the initial_workers: 10, min_workers: 0, max_workers: 10 options (cluster.yaml) to initially spin up the cluster to full capacity…

Denis (13)
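If I read the autoscaler docs correctly, the SDK call request_resources can stand in for initial_workers: it asks the autoscaler to scale to a target immediately, and resetting it restores the min_workers floor. A sketch, assuming each worker node provides 4 CPUs:

import ray
from ray.autoscaler.sdk import request_resources

ray.init(address="auto")
request_resources(num_cpus=40)   # ~10 workers x 4 CPUs, requested up front
# ... run the tune job here ...
request_resources(num_cpus=0)    # release the floor; min_workers applies again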
0 votes · 1 answer
TuneError: ('Trials did not complete')
I wrote a program using Keras that distinguishes real texts from fake ones (I used 5,000 training samples and 10,000 test samples); I used a Transformer with the 'distilbert-base-uncased' model for detection. Now I want to do hyperparameter tuning using grid search,…

fateme shamshiri (69)
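TuneError: ('Trials did not complete') only reports that some trials errored; the real traceback is written to error.txt in each failed trial's logdir. A minimal sketch for surfacing it, with train_fn as a placeholder:

from ray import tune

analysis = tune.run(
    train_fn,
    config={"lr": tune.grid_search([1e-5, 3e-5, 5e-5])},
    raise_on_failed_trial=False,  # keep the analysis object even when trials fail
)
for trial in analysis.trials:
    print(trial.status, trial.logdir)  # look for error.txt under failed logdirs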
0 votes · 0 answers
Ray Tune's PB2 fails consistently on the same actor at the same training point because Tune code raises a ValueError
I have started several trials using Ray Tune's PB2. They use 8 actors and perturb every 20 steps. Actors 0-6 don't have any trouble, but actor 7, in the second 20-step epoch, consistently hits an error. In the terminal, I get the following…

LaMaster90 (11)
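For context, a minimal PB2 setup of the shape described (8 trials, perturbation every 20 iterations); the metric name and bounds are placeholders, and PB2 additionally needs the GPy package installed:

from ray import tune
from ray.tune.schedulers.pb2 import PB2

pb2 = PB2(
    time_attr="training_iteration",
    metric="episode_reward_mean",
    mode="max",
    perturbation_interval=20,
    hyperparam_bounds={"lr": [1e-5, 1e-2]},
)
tune.run(my_trainable, num_samples=8, scheduler=pb2)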
0 votes · 1 answer
type 'NoneType' is not iterable error when training a PyTorch model with Ray Tune's Trainable API
I wrote a simple PyTorch script to train MNIST and it worked fine. I reimplemented my script using the Trainable class:
import numpy as np
import torch
import torch.optim as optim
import torch.nn as nn
from torchvision import datasets,…

Alex Goft (1,114)
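A frequent cause of this particular error is step() returning None, since Tune iterates over the returned metrics dict. A sketch of the expected shape, where build_model, train_epoch and evaluate are hypothetical helpers:

from ray import tune

class TrainMNIST(tune.Trainable):
    def setup(self, config):
        self.model, self.optimizer = build_model(config)   # hypothetical

    def step(self):
        train_epoch(self.model, self.optimizer)            # hypothetical
        acc = evaluate(self.model)                         # hypothetical
        return {"mean_accuracy": acc}  # must return a dict, never None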
0 votes · 1 answer
How do I make ray.tune.run reproducible?
I'm using the Tune class-based Trainable API. See the code sample:
from ray import tune
import numpy as np
np.random.seed(42)
# first run
tune.run(tune.Trainable, ...)
# second run, expecting same result
np.random.seed(42)
tune.run(tune.Trainable,…

ptyshevs (1,602)
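Seeding the driver process is not enough, because each trial runs in its own worker process; the seed has to be set inside the Trainable itself (passed through config so every trial sees it). A minimal sketch:

import random
import numpy as np
from ray import tune

class Reproducible(tune.Trainable):
    def setup(self, config):
        # re-seed inside the worker process, once per trial
        random.seed(config["seed"])
        np.random.seed(config["seed"])

    def step(self):
        return {"value": float(np.random.rand())}

tune.run(Reproducible, config={"seed": 42}, stop={"training_iteration": 1})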
0 votes · 1 answer
Using Ray-Tune with sklearn's RandomForestClassifier
Putting together different base and documentation examples, I have managed to come up with this:
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2)
def objective(config, reporter):
    for i in range(config['iterations']):
        …

LeggoMaEggo (512)
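A minimal function-API sketch of the same idea, reporting cross-validated accuracy (X_train and y_train come from the split above; the search bounds are placeholders):

from ray import tune
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score

def objective(config):
    clf = RandomForestClassifier(
        n_estimators=config["n_estimators"],
        max_depth=config["max_depth"],
    )
    acc = cross_val_score(clf, X_train, y_train, cv=3).mean()
    tune.report(mean_accuracy=acc)  # the quantity the searcher optimizes

analysis = tune.run(
    objective,
    config={
        "n_estimators": tune.randint(50, 300),
        "max_depth": tune.randint(2, 16),
    },
    num_samples=20,
    metric="mean_accuracy",
    mode="max",
)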
0 votes · 1 answer
Can you use different stopping conditions for schedulers versus general tune trials
In Ray Tune, is there any guidance on whether using different stopping conditions for a scheduler versus a trial is fine to do?
Below, I have an async hyperband scheduler stopping based on neg_mean_loss, and tune itself stopping based on…

rasen58 (4,672)
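The two conditions compose cleanly: the scheduler's metric drives early termination of underperforming trials, while tune's stop argument caps every trial regardless of the scheduler. A sketch of that setup, with my_trainable as a placeholder:

from ray import tune
from ray.tune.schedulers import AsyncHyperBandScheduler

scheduler = AsyncHyperBandScheduler(
    time_attr="training_iteration",
    metric="neg_mean_loss",   # the scheduler prunes on this
    mode="max",
    grace_period=5,
    max_t=100,
)
tune.run(
    my_trainable,
    scheduler=scheduler,
    stop={"training_iteration": 100},  # hard cap for all trials
    num_samples=20,
)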