Questions tagged [ray]

Ray is a library for writing parallel and distributed Python applications. It scales from your laptop to a large cluster, has a simple yet flexible API, and provides high performance out of the box.

Its API provides a simple way to take arbitrary Python functions and classes and execute them in a distributed setting.

Ray also includes a number of powerful libraries:

  • Cluster Autoscaling: Automatically configure, launch, and manage clusters and experiments on AWS or GCP.
  • Hyperparameter Tuning: Automatically run experiments, tune hyperparameters, and visualize results with Ray Tune.
  • Reinforcement Learning: RLlib is a state-of-the-art platform for reinforcement learning research as well as reinforcement learning in practice.
  • Distributed Pandas: Modin provides a faster dataframe library with the same API as Pandas.
702 questions
0
votes
1 answer

How to step envs in parallel using one environment per worker?

We have built a system consisting of Docker containers, each running Ray. One container takes on the role of head and the others as workers. Is there a way to run our custom env's steps in parallel, while one env per worker per container is…
0
votes
1 answer

Question about gradient synchronization in the ResNet example

Ray provides an example of distributed ResNet training, but the gradient synchronization is strange: (1) sync up weights, (2) train each worker independently for a specific number of steps, (3) go back to step 1. Is there any justification for this workflow? I think it's…
scott huang
  • 2,478
  • 4
  • 21
  • 36
0
votes
1 answer

Passing a user-defined object between sub-processes

I have three files as follows """ main.py """ import time import ray class LocalBuffer(dict): def __call__(self): return self @ray.remote class Worker(): def __init__(self, learner): self.local = LocalBuffer() …
Maybe
  • 2,129
  • 5
  • 25
  • 45
0
votes
2 answers

Can't find the rllib command after installing Ray

I want to try this command: rllib train --env=Pong-ram-v4 --run=PPO but rllib can't be found. I can do this by directly executing train.py in the rllib source code, but running it as a command is certainly more elegant. Can someone tell me what's wrong? Here is…
scott huang
  • 2,478
  • 4
  • 21
  • 36
0
votes
1 answer

Setup cluster between multiple AWS accounts

I would like to set up a Ray cluster to use Ray Tune over 4 GPUs on AWS, but each GPU belongs to a different member of our team. I have scoured the available resources for an answer and found nothing. Help?
0
votes
1 answer

Ray on_train_result callback retrieving episode_id

I am using the APEX-DQN Agent (AsyncReplayOptimizer) from Ray/RLlib. I want to use some episode data info["episode"].user_data from the callback on_episode_end(info) to alter the info["result"] dictionary in on_train_result(info). Is there any way…
0
votes
1 answer

How to get the full CPU usage on Ray cluster, while executing RLlib algorithms?

I am trying to run RLlib algorithms on a Ray cluster. I get the following message: "Memory usage on this node: 20.8/64.4 GB". How can I make it use the memory fully? How can I check whether the GPU or CPU utilization is above 90%? Kindly…
Deepak Nayak
  • 157
  • 1
  • 1
  • 7
0
votes
0 answers

Python Code Ray Casting Multiple Points

I am trying to write Python ray-casting code that returns a list of True or False values showing whether points are in polygons. I have a list that contains the lat/lons of multiple polygons, and two additional lists, one being the latitude, and the…
friedmmj
  • 1
  • 3
-1
votes
0 answers

import deltaray library error - 'type' object is not subscriptable

I wonder if anyone can give me advice on why I am getting this error when I install the deltaray library. I am trying to run their demo in a Google Colab notebook. The installation seems to run fine and I restart the runtime as requested. Here's…
Juliana
  • 31
  • 4
-1
votes
1 answer

Simultaneous matrix vector multiplication with multiprocessing

What I want to do: I have an m x n NumPy array A, where m << n, that I want to load on a node where all 20 CPUs on that node can share memory. On each CPU, I want to multiply A by an n x 1 vector v, where the vector v differs on each CPU but matrix A…
-1
votes
1 answer

Update progressbar inside multiprocessor functions (Ray) simultaneously

I'm writing a program that uses the Ray package for multiprocessing. In the program, there is a function that is called 5 times at the same time. During the execution, I want to show a progress bar using the PyQt5 QProgressBar to indicate…
-1
votes
1 answer

Asynchronous Training with Ray

I want to be able to throw a lot of data collection tasks at some Ray workers while a trainer works concurrently and asynchronously on another CPU, training on the collected data. The notion resembles this example from the docs:…
Gabizon
  • 339
  • 4
  • 15
-1
votes
1 answer

How to query docker container from host?

I am running a server within a docker container on my machine. I want to send HTTP requests from my machine (host) to the server which is inside the container. When I send a GET request with HTTP inside the container, it detects the server. But if I…
ava_punksmash
  • 357
  • 1
  • 4
  • 13
-1
votes
1 answer

Intersection of a ray with a segment in 3d

A ray and a segment of a polygon lie in the same plane. The normal vector of this plane is known. I need to know whether the ray intersects this segment.
user14205063
-1
votes
1 answer

Why is Ray Tune only using one worker?

Ray is starting only one worker, even though enough GPUs and CPUs are available to launch more workers. How can I increase the number of workers?
Tom Dörr
  • 859
  • 10
  • 22