
I'm trying to find out how RLlib makes efficient use of frames, i.e. how it avoids storing repetitive frames in memory, which the OpenAI baselines handle via LazyFrames.

In RLlib's atari_wrappers.py, all observations appear to be stored as plain ndarrays: https://github.com/ray-project/ray/blob/master/python/ray/rllib/env/atari_wrappers.py#L253

This stands in contrast to the common use of LazyFrames in the OpenAI baselines: https://github.com/openai/baselines/blob/master/baselines/common/atari_wrappers.py#L217

Was this done because PyArrow cannot serialize LazyFrames and requires NumPy arrays? Even if that's the case, given that the _get_ob output in RLlib is a concatenated NumPy array of the 4 observations, isn't the memory requirement much higher than, say, storing each of the 4 observations separately and linking them via Ray object IDs? What am I missing here?
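For context, the memory saving the question refers to comes from frame stacks sharing their underlying buffers. Here is a minimal sketch of the LazyFrames idea (a hypothetical simplification, not the baselines implementation): consecutive stacked observations hold references to the same frame arrays and only concatenate on access.

```python
import numpy as np

class LazyFrames:
    """Simplified sketch: store references to individual frames and
    concatenate only when the observation is actually materialized."""

    def __init__(self, frames):
        self._frames = frames  # list of ndarrays, shared across stacks

    def __array__(self, dtype=None, copy=None):
        out = np.concatenate(self._frames, axis=-1)
        if dtype is not None:
            out = out.astype(dtype)
        return out

# Five raw 84x84 grayscale frames from consecutive steps
frames = [np.zeros((84, 84, 1), dtype=np.uint8) for _ in range(5)]

# Two consecutive stacked observations share three of their four frames,
# so each underlying frame buffer is stored in memory only once.
obs_t = LazyFrames(frames[0:4])
obs_t1 = LazyFrames(frames[1:5])
print(np.asarray(obs_t).shape)  # (84, 84, 4)
```

By contrast, eagerly concatenating in `_get_ob` copies each frame into every stack that contains it, which is the roughly 4x overhead the question is asking about.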

Muppet

1 Answer


RLlib doesn't use LazyFrames. For memory-hungry algorithms such as DQN, it instead compresses the observations using LZ4, which gives much larger savings at the cost of some extra CPU time.
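A rough sketch of the compress-on-store / decompress-on-read pattern described above. This uses zlib from the standard library as a stand-in for LZ4 (both are lossless byte compressors with the same compress/decompress shape); it is not RLlib's actual code path.

```python
import zlib
import numpy as np

# A stacked Atari-style observation: mostly-redundant uint8 pixels
obs = np.zeros((84, 84, 4), dtype=np.uint8)
obs[10:20, 10:20, :] = 255  # a little non-constant structure

# On insertion into the replay buffer: compress the raw bytes
raw = obs.tobytes()          # 84 * 84 * 4 = 28224 bytes
packed = zlib.compress(raw)  # much smaller for redundant frames

# On sampling: decompress and rebuild the array
restored = np.frombuffer(zlib.decompress(packed),
                         dtype=np.uint8).reshape(obs.shape)
assert np.array_equal(obs, restored)
print(len(raw), len(packed))
```

Because game frames are highly compressible, this typically shrinks each stored observation far more than the roughly 4x that frame sharing alone would recover, at the cost of a compress/decompress step per buffer access.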

Eric
  • Could you explain why you call `pyarrow.serialize(data).to_buffer().to_pybytes()` first when compressing? Wouldn't it be sufficient to have a C-contiguous array and compress that? Why the need for each of these methods? – Muppet Jul 18 '19 at 23:11
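One plausible reason for the serialization step the comment asks about: compressing the raw array bytes alone discards the dtype and shape, so the array is first turned into a self-describing buffer, then compressed. A hedged sketch, with pickle and zlib standing in for pyarrow.serialize and LZ4:

```python
import pickle
import zlib
import numpy as np

arr = np.arange(12, dtype=np.float32).reshape(3, 4)

# Serialize to a self-describing buffer (keeps dtype and shape),
# then compress the bytes; reverse the steps to restore.
packed = zlib.compress(pickle.dumps(arr))
restored = pickle.loads(zlib.decompress(packed))

assert restored.dtype == arr.dtype and restored.shape == arr.shape
assert np.array_equal(arr, restored)
```

Compressing `arr.tobytes()` directly would also work, but the receiver would then need the dtype and shape communicated out of band to reconstruct the array.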