Questions tagged [ray]

Ray is a library for writing parallel and distributed Python applications. It scales from your laptop to a large cluster, has a simple yet flexible API, and provides high performance out of the box.

At its core, Ray's API provides a simple way to take arbitrary Python functions and classes and execute them in a distributed setting.

Ray also includes a number of powerful libraries:

  • Cluster Autoscaling: Automatically configure, launch, and manage clusters and experiments on AWS or GCP.
  • Hyperparameter Tuning: Automatically run experiments, tune hyperparameters, and visualize results with Ray Tune.
  • Reinforcement Learning: RLlib is a state-of-the-art platform for reinforcement learning research as well as reinforcement learning in practice.
  • Distributed Pandas: Modin provides a faster dataframe library with the same API as Pandas.
702 questions
3
votes
0 answers

Dead kernel when using ray

I tried to use Ray for crawling some data. My original code, before using Ray, is below and works well. def download(n): #download nth data return downloaded_data I then used Ray, following the Ray tutorial, and this makes my kernel…
3
votes
1 answer

How do dependencies get to a Ray cluster?

I’m trying to figure out whether Ray will work for an application, and I’m trying to understand how dependencies get to the workers in a Ray cluster. Ex: let’s say I have @ray.remote def foo(): a = do_something_requiring_pandas() b =…
augray
  • 3,043
  • 1
  • 17
  • 30
3
votes
0 answers

Averaging learning curves over repetitions with confidence intervals [Ray Tune, Tensorboard]

For reinforcement learning experiments, I often run independent repetitions for each hyperparameter setting. Ideally, I would visualize the average of these repetitions (per setting), including confidence intervals around the mean learning curve. I…
Thomas
  • 61
  • 2
3
votes
1 answer

Multiple-Actions in one step, Reinforcement learning

I am trying to write a custom OpenAI Gym environment in which the agent takes two actions in each step, one discrete and the other continuous. I am using Ray RLlib with the SAC algorithm, as it supports both discrete and…
JoCode
  • 31
  • 2
3
votes
0 answers

Shared Memory with Ray

I am parallelising a CPU-bound task that takes a large nested list as read-only input. To avoid repeatedly copying the nested list into each process, I would like to make the object accessible via shared memory. I am working on a 64-bit Windows…
Rose
  • 51
  • 3
3
votes
1 answer

Ray object store running out of memory using out of core. How can I configure an external object store like s3 bucket?

import ray import numpy as np ray.init() @ray.remote def f(): return np.zeros(10000000) results = [] for i in range(100): print(i) results += ray.get([f.remote() for _ in range(50)]) Normally, when the object store fills up, it begins…
testgauss321
  • 77
  • 1
  • 5
3
votes
1 answer

Ray Dashboard error in the browser "ERR_CONNECTION_REFUSED"

When starting ray 1.2.0 in a jupyter notebook: import ray ray.init(num_cpus = 8) The log says that the dashboard is available on: 2021-02-25 16:21:23,636 INFO services.py:1174 -- View the Ray dashboard at http://127.0.0.1:8265 But I get…
Daniel Sobrado
  • 717
  • 1
  • 9
  • 22
3
votes
1 answer

Ray python example with multiple returns and calling ray.get()

The code below produces the desired behaviour. Is it possible to pass the second argument from the first two functions without having to call ray.get prematurely? @ray.remote def color(): image=cv2.imread("frame30.png", flags=0) argument=…
PolarBear10
  • 2,065
  • 7
  • 24
  • 55
3
votes
2 answers

Quick and resilient way to expose a python library API over the network as well as locally

I'm looking for an easy and dependency-light way of wrapping a python library to expose it over: a) The network, either via HTTP or some other custom protocol, it doesn't matter that much, and encryption isn't required. b) The local machine, the…
George
  • 3,521
  • 4
  • 30
  • 75
3
votes
2 answers

Launch AWS multi-node Ray cluster and submit simple python script to run in conda env

What I am finding with Ray is that the documentation for autoscaling is lacking, and that the config is easily broken without a clear reason why. I first wanted to pull a Docker image to the cluster and run this, but the way ray handles…
3
votes
1 answer

Launching a simple python script on an AWS ray cluster with docker

I am finding it incredibly difficult to follow rays guidelines to running a docker image on a ray cluster in order to execute a python script. I am finding a lack of simple working examples. So I have the simplest docker file: FROM…
3
votes
1 answer

python parallel processing running all tasks on one core - multiprocessing, ray

I have a model.predict() method that takes about 7 seconds to run on 65536 rows of data. I wanted to speed this up using the joblib.parallel_backend tooling, following this example. This is my code: import numpy as np from joblib import load,…
3
votes
1 answer

What is the difference between Neural Network Frameworks and RL Algorithm Libraries?

I know this is a silly question, but I cannot find a good way to put it. I've worked with TensorFlow and TFAgents, and am now moving to Ray RLlib. Looking at all the RL frameworks/libraries, I got confused about the difference between the two…
Kai Yun
  • 97
  • 8
3
votes
1 answer

Store objects between remote functions in Ray

I'm writing a project which writes using the same data a ton of times, and I have been using Ray to scale this up in a cluster setting. However, the files are too large to send back and forth or save on the Ray object store all the time. Is there a way…
Xavanteex
  • 97
  • 5
3
votes
0 answers

Error in cloudpickle pickling with ray when parallelising a MatLab function using python

I would like to parallelise a MATLAB function using Python and Ray. In this example, I used MathWorks to wrap a very simple MATLAB function as a Python application (the function just returns the input integer as its result - see below). Instructions…