Questions tagged [ray]

Ray is a library for writing parallel and distributed Python applications. It scales from your laptop to a large cluster, has a simple yet flexible API, and provides high performance out of the box.

Its API provides a simple way to take arbitrary Python functions and classes and execute them in a distributed setting.

Ray also includes a number of powerful libraries:

  • Cluster Autoscaling: Automatically configure, launch, and manage clusters and experiments on AWS or GCP.
  • Hyperparameter Tuning: Automatically run experiments, tune hyperparameters, and visualize results with Ray Tune.
  • Reinforcement Learning: RLlib is a state-of-the-art platform for reinforcement learning research as well as reinforcement learning in practice.
  • Distributed Pandas: Modin provides a faster dataframe library with the same API as Pandas.
702 questions
0 votes, 0 answers

Redis failed to start while running a program with Python Ray Library - RuntimeError: Couldn't start Redis

I am trying to use ray[tune] on a Linux server where I am just a user without root permission. After installing ray[tune] and running the Python code from ray import tune, I got an error message: ImportError: /lib64/libc.so.6: version `GLIBC_2.14'…
Salim • 373 • 1 • 9 • 19
0 votes, 1 answer

Python Ray not completing all remote calls

So I have a simple script: import ray import requests as r ray.init() @ray.remote def f(i): print(i) r.get("http://127.0.0.1:5000/" + str(i)) return i if __name__ == "__main__": for k in range(1000): f.remote(k) When I…
Ali Faki • 3,928 • 3 • 17 • 25
0 votes, 0 answers

How can I set up a Ray (a distributed programming framework) and Flask app on AWS Elastic Beanstalk?

Background: I've been trying, unsuccessfully, to integrate Ray (https://github.com/ray-project/ray) into the production version of our AI API. Essentially, we want to use it to speed up one of our clustering algorithms to reduce the delays that occur…
0 votes, 1 answer

Actor abstraction in Ray

While learning Ray for distributed Python, I read the following statement about Ray, and I do not fully understand what it really means. Any explanation with some application context is appreciated. Actors: An important part…
user288609 • 12,465 • 26 • 85 • 127
0 votes, 1 answer

Why does TensorBoard not show all metrics?

I'm using Ray Tune for hyperparameter optimization and logging. Ray Tune successfully logs my scalar values and writes them to the TensorBoard log. The values do show up in TensorBoard's 'SCALARS' section, but in the 'HPARAMS' section, only test_acc…
Tom Dörr • 859 • 10 • 22
0 votes, 1 answer

Why is Ray Tune track not saving all datapoints?

The following code block is meant to save multiple metrics from my current run: track.log(pgd_loss=pgd_loss) track.log(pgd_acc=pgd_acc) track.log(test_loss=test_loss) track.log(test_acc=test_acc) However, only the last line…
Tom Dörr • 859 • 10 • 22
0 votes, 1 answer

Ray: Loop through list with additional arguments across cluster

I have a function that loops through a list: for id_ in id_list: scrape(id_, str(id_.split('/')[-1]), id_list.index(id_), len(id_list), target_path) I can run it in parallel using: for id_ in id_list: p1 = Process(target=scrape,…
0 votes, 0 answers

ray.init() throws List Index Out of Range error

I am using version 0.8.6. The error occurs in the connect function in worker.py when ray.init() is run. I was able to trace the error to the part below. if mode == SCRIPT_MODE: # Add the directory containing the script that is…
0 votes, 1 answer

type 'NoneType' is not iterable error when training a PyTorch model with Ray Tune's Trainable API

I wrote a simple PyTorch script to train MNIST and it worked fine. I reimplemented my script using the Trainable class: import numpy as np import torch import torch.optim as optim import torch.nn as nn from torchvision import datasets,…
Alex Goft • 1,114 • 1 • 11 • 23
0 votes, 1 answer

Script breaks out of function after sending a few packets

I am writing a Python program that sends packets for a specified amount of time. The sending script: import socket import time import networkparam import ray ray.init() transformer_sending_time = 0 final_message_sent = False @ray.remote def…
walter_dane • 115 • 4 • 8
0 votes, 1 answer

How do I make ray.tune.run reproducible?

I'm using the Tune class-based Trainable API. See the code sample: from ray import tune import numpy as np np.random.seed(42) # first run tune.run(tune.Trainable, ...) # second run, expecting same result np.random.seed(42) tune.run(tune.Trainable,…
ptyshevs • 1,602 • 11 • 26
0 votes, 1 answer

Synchronize a loop within ray.remote

I have an enormous remote function that is parallelized with Ray, but within it there is a loop I really need to be executed serially: each iteration must be globally executed once and only once. So, my thinking was to have a mutex to synchronize…
Stipe Galić • 153 • 6
0 votes, 1 answer

A string in a python function's parameter list

In this project, I encountered code like the following: def update_prio_and_stats(item: ("ActorHandle", dict, int)): I know this specifies that the argument passed to update_prio_and_stats should be a tuple of ("ActorHandle", dict, int). The dict…
Maybe • 2,129 • 5 • 25 • 45
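Background for this question: a string inside an annotation is just data. Python stores annotations without checking or enforcing them, and a quoted name like "ActorHandle" is typically a forward reference to a type that is not imported at that point (often to avoid a circular import). A minimal illustration with a made-up name:

```python
def update(item: ("SomeHandle", dict, int)):
    # The annotation is plain metadata: a tuple holding the string
    # "SomeHandle" and the types dict and int. Nothing enforces it.
    handle, stats, count = item
    return count

# annotations are stored on the function object, uninterpreted
print(update.__annotations__["item"])
assert update(("h", {"loss": 0.1}, 3)) == 3
```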
0 votes, 1 answer

Errors when trying to use DQN algorithm for FrozenLake Openai game

I am trying to make a very simple DQN algorithm work with the FrozenLake-v0 game, but I am getting errors. I understand that it could be overkill to use DQN instead of a Q-table, but I would nonetheless like it to work. Here is the code: import…
mikanim • 409 • 7 • 21
0 votes, 1 answer

Disabling Hyperthreading on Nodes in an AWS EC2 Cluster through Ray Configuration

I have a task running on an EC2 cluster which slows down progressively as virtual CPUs are employed (regardless of EBS volume size). To avoid this, I want to disable hyperthreading on all nodes and was trying to implement the advice given…
Nick Mint • 145 • 12