Questions tagged [ray]

Ray is a library for writing parallel and distributed Python applications. It scales from a laptop to a large cluster, has a simple yet flexible API, and provides high performance out of the box. Its API provides a simple way to take arbitrary Python functions and classes and execute them in a distributed setting.

Ray also includes a number of powerful libraries:

  • Cluster Autoscaling: Automatically configure, launch, and manage clusters and experiments on AWS or GCP.
  • Hyperparameter Tuning: Automatically run experiments, tune hyperparameters, and visualize results with Ray Tune.
  • Reinforcement Learning: RLlib is a state-of-the-art platform for reinforcement learning research as well as reinforcement learning in practice.
  • Distributed Pandas: Modin provides a faster dataframe library with the same API as Pandas.
702 questions
7 votes · 1 answer

Is there a way to limit Ray object store max memory usage

I'm trying to capitalize on Ray's parallelization model to process a file record by record. The code works beautifully, but the object storage grows fast and ends up crashing. I'm avoiding using ray.get(function.remote()) because it kills the…
LipeScia
6 votes · 3 answers

How do I stop a Python Ray Cluster

I've been looking through the documentation, and I can't find any way to stop Ray after an answer is calculated within a tolerance level. Right now I'm writing the answer to a csv with a print('Ok to stop'), and then I kill the process manually. I'd…
user15016496
6 votes · 2 answers

How to use Python Ray to parallelise over a large list?

I want to parallelise the operation of a function on each element of a list using Ray. A simplified snippet is below: import numpy as np; import time; import ray; import psutil; num_cpus =…
PyRsquared
6 votes · 2 answers

Given a trained environment, how do I evaluate the policy at a specific state?

I've trained a Ray-RLlib PPOTrainer on a custom environment. How do I evaluate the policy at a specific state? Full example: from ray.rllib.agents.ppo import PPOTrainer from cust_env.envs import CustEnv from ray.tune.logger import…
Jeff
6 votes · 2 answers

Parallel: Import a python file from sibling folder

I have a directory tree: working_dir\ contains main.py, my_agent\my_worker.py, and my_utility\my_utils.py. Code in each file is as follows: """ main.py """ import os,…
Maybe
5 votes · 2 answers

What does GCS server do and what does the acronym stand for?

Found this acronym in the docs of Ray Core, used for its main API server: [..] the head node needs to open several more ports: --port: Port of Ray (GCS server). The head node will start a GCS server listening on this port. Default: 6379.
mirekphd
5 votes · 3 answers

Ray on slurm - Problems with initialization

Since I started using Slurm, I have not been able to use Ray correctly. Whenever I run ray.init() followed by trainer = A3CTrainer(env="my_env") (I have registered my env with tune), the program crashes with the following message…
5 votes · 1 answer

How to clear objects from the object store in ray?

I am trying out the promising multiprocessing package Ray. I have a problem I can't seem to solve. My program runs fine the first time, but on a second run this exception is raised on the ray.put() line: ObjectStoreFullError: Failed to put…
Stefan
5 votes · 2 answers

Number of time steps in one iteration of RLlib training

I am new to reinforcement learning and I am working on RL for a custom environment in OpenAI Gym with RLlib. When I create a custom environment, do I need to specify the number of episodes in the __init__() method? Also, when I train the agent…
user3443033
5 votes · 1 answer

RLLib - Tensorflow - InvalidArgumentError: Received a label value of N which is outside the valid range of [0, N)

I'm using RLLib's PPOTrainer with a custom environment, I execute trainer.train() two times, the first one completes successfully, but when I execute it for the second time it crashed with an error:…
Devid Farinelli
5 votes · 2 answers

Ray is much slower than both plain Python and multiprocessing

I load 130k JSON files. I do this with Python: import os import json import pandas as pd path = "/my_path/" filename_ending = '.json' json_list = [] json_files = [file for file in os.listdir(f"{path}") if…
Outcast
5 votes · 2 answers

How to process massive amounts of data in parallel without using up memory with Python Ray?

I am considering using Ray for a simple implementation of parallel processing of data: there are massive amounts of data items to be processed, which become available through a stream / iterator. Each item is of significant size; a function should be…
jpp1
5 votes · 2 answers

Purpose of 'num_samples' in Ray Tune for hyperparameter optimization

I am trying to perform a hyperparameter optimization task for an LSTM (pure Tensorflow) with Tune. I followed their example on the hyperopt algorithm. In the example they have used the below line inside the 'config' section. "num_samples": 10 if…
Suleka_28
4 votes · 1 answer

How do I use ray.wait() with an Actor class?

I am developing an Actor class and using ray.wait() to collect the results. Below are the code and console outputs, which show results collected for only 2 of the 3 Actors. import time import ray @ray.remote class Tester: def…
tompal18
4 votes · 1 answer

Ray dashboard unavailable on local cluster

I am having a lot of problems getting Ray to show the dashboard. I'm just using Ray Core to parallelize tasks, as you can see: $ pipenv --python 3.9 shell $ pip --version pip 22.0.2 from .../lib/python3.9/site-packages/pip (python 3.9) $ pip…
David Amrani