I was curious about the same thing, so I looked into the source code, and this is what I found.
OpenAI uses pyglet to display the window and animations.
To show an animation, everything is drawn onto the window and then rendered.
pyglet then keeps what is being displayed in a buffer.
Here is a simplified version of how the code is written in OpenAI:
import pyglet
from pyglet.gl import *
import numpy as np

# pick the first screen and its best OpenGL config
display = pyglet.canvas.get_display()
screens = display.get_screens()
config = screens[0].get_best_config()
# keep a reference to the window so it is not garbage collected
window = pyglet.window.Window(width=500, height=500, display=display, config=config)

# draw whatever you want here

# get the image back from the color buffer
buffer = pyglet.image.get_buffer_manager().get_color_buffer()
image_data = buffer.get_image_data()
arr = np.frombuffer(image_data.get_data(), dtype=np.uint8)  # flat array of bytes
print(arr)
print(arr.shape)
Output:
[0 0 0 ... 0 0 0]
(1000000,)
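The flat array of 1000000 values matches 500 x 500 pixels with 4 bytes each, so it can be reshaped into an image. Below is a minimal sketch of that step, assuming the bytes are in RGBA order and that the rows are stored bottom-up (the usual OpenGL convention). The helper name `rgba_bytes_to_rgb` is my own and is not part of pyglet or OpenAI's code.

```python
import numpy as np


def rgba_bytes_to_rgb(raw, width, height):
    """Turn flat RGBA bytes into a (height, width, 3) RGB array.

    Assumes 4 bytes per pixel and bottom-up row order.
    """
    arr = np.frombuffer(raw, dtype=np.uint8)
    arr = arr.reshape(height, width, 4)
    # flip vertically so row 0 is the top of the image, and drop alpha
    return arr[::-1, :, :3]


# stand-in for image_data.get_data(): an all-black 500 x 500 frame
raw = bytes(500 * 500 * 4)
frame = rgba_bytes_to_rgb(raw, 500, 500)
print(frame.shape)  # (500, 500, 3)
```

With a real window you would pass `image_data.get_data()` together with `buffer.width` and `buffer.height` instead of the dummy bytes.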
So basically every image we get comes from the buffer of what is being displayed in the window.
If we don't draw anything in the window, we get no image, so the window is required to get the image.
You need to find a way to keep the window from being displayed while its contents are still stored in the buffer.
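One thing that might be worth trying is pyglet's `visible=False` window argument. This is only a sketch, and I have not confirmed it works everywhere: whether a hidden window's buffer holds valid pixels depends on the platform and driver, and it still needs a display and an OpenGL context. The function name `grab_hidden_frame` is hypothetical, and the code assumes a pyglet version that provides `pyglet.image.get_buffer_manager()` as used in the snippet above.

```python
import numpy as np


def grab_hidden_frame(width=500, height=500):
    """Draw into a window that is never shown and return its pixels.

    Hypothetical helper; may return garbage on some platforms.
    """
    import pyglet  # imported here because it needs a working display

    # visible=False creates the window without mapping it on screen
    window = pyglet.window.Window(width=width, height=height, visible=False)
    try:
        window.switch_to()  # make this window's GL context current
        window.clear()
        # draw whatever you want here

        buffer = pyglet.image.get_buffer_manager().get_color_buffer()
        image_data = buffer.get_image_data()
        arr = np.frombuffer(image_data.get_data(), dtype=np.uint8)
        # assumes 4 bytes per pixel, rows stored bottom-up
        arr = arr.reshape(buffer.height, buffer.width, 4)
        return arr[::-1, :, :3]
    finally:
        window.close()
```

Calling `grab_hidden_frame()` should return a `(height, width, 3)` array if the hidden window is rendered on your system. If it comes back empty or corrupted, a virtual display is the other route people usually try.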
I know it's not what you wanted, but I hope it leads you to a solution.