I'm trying to find out how RLlib makes efficient use of frames, i.e. how it avoids storing the same frame in memory multiple times, which the OpenAI baselines handle via LazyFrames.
In Ray's RLlib atari_wrappers.py, it seems that all observations are stored as plain ndarrays: https://github.com/ray-project/ray/blob/master/python/ray/rllib/env/atari_wrappers.py#L253
This stands in contrast to the common use of LazyFrames in the OpenAI baselines: https://github.com/openai/baselines/blob/master/baselines/common/atari_wrappers.py#L217
Was this done because PyArrow cannot make use of LazyFrames and requires numpy arrays? Even if that's the case, given that the _get_ob output in RLlib is a concatenated numpy array of the 4 observations, isn't the memory requirement much higher than, say, storing each of the 4 observations separately and linking them via Ray object IDs? What am I missing here?
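
To make the memory concern concrete, here is a minimal sketch of the difference I mean. It is not RLlib's or baselines' actual code: `LazyStack` is a hypothetical, simplified stand-in for the baselines LazyFrames idea, the 84x84x1 uint8 frame shape is an assumption, and `fake_frame`, `collect` and `unique_frame_bytes` are names I made up for illustration. It only counts bytes held in a plain Python list and says nothing about how Ray's object store serializes things.

```python
from collections import deque

import numpy as np

K = 4                # frames per stacked observation
SHAPE = (84, 84, 1)  # assumed preprocessed Atari frame shape
STEPS = 100          # number of environment steps to simulate


def fake_frame(t):
    """Hypothetical stand-in for one preprocessed frame."""
    return np.full(SHAPE, t % 256, dtype=np.uint8)


class LazyStack:
    """Keeps references to K frames; concatenates only when converted."""

    def __init__(self, frames):
        self._frames = list(frames)  # references, not copies

    def __array__(self, dtype=None, copy=None):
        out = np.concatenate(self._frames, axis=2)
        return out if dtype is None else out.astype(dtype)


def collect(lazy):
    """Simulate frame stacking and keep every stacked observation."""
    frames = deque([fake_frame(0)] * K, maxlen=K)
    stored = []
    for t in range(1, STEPS + 1):
        frames.append(fake_frame(t))
        if lazy:
            stored.append(LazyStack(frames))
        else:
            # Eager variant: each observation owns a full copy of K frames.
            stored.append(np.concatenate(frames, axis=2))
    return stored


def unique_frame_bytes(lazy_stored):
    """Count each underlying frame buffer once, however often it is shared."""
    seen = {}
    for ob in lazy_stored:
        for frame in ob._frames:
            seen[id(frame)] = frame.nbytes
    return sum(seen.values())


eager_stored = collect(lazy=False)
lazy_stored = collect(lazy=True)

eager_bytes = sum(ob.nbytes for ob in eager_stored)
lazy_bytes = unique_frame_bytes(lazy_stored)

print("eager bytes:", eager_bytes)
print("lazy bytes: ", lazy_bytes)
print("ratio:      ", eager_bytes / lazy_bytes)
```

In this toy setup the eager variant holds roughly K times as much frame data as the shared-reference variant, while `np.asarray(lazy_stored[-1])` still yields the same stacked array as `eager_stored[-1]`. That factor is what I would expect to matter for a large replay buffer, unless something else in RLlib deduplicates or compresses the stacked observations.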