Do I need to invalidate the cache before reading from the frame buffer?

Question

I'm trying to read from /dev/fb0 on a Linux machine.

I just call open("/dev/fb0", O_RDWR), then mmap the device, then memcpy from the mapped pointer.
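For reference, the read path described above might look roughly like the following sketch. The helper names `fb_frame_bytes` and `snapshot_mapped` are hypothetical, not taken from any SDK. The sketch assumes the visible frame starts at offset 0 of the mapping (no panning or double buffering) and opens the device read-only rather than with O_RDWR, since it only reads.

```c
#include <assert.h>
#include <fcntl.h>
#include <linux/fb.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>
#include <sys/ioctl.h>
#include <sys/mman.h>
#include <unistd.h>

/* Hypothetical helper: size in bytes of one visible frame, or 0 if `fd`
 * is not a framebuffer device. Assumes the frame starts at offset 0. */
static size_t fb_frame_bytes(int fd)
{
    struct fb_var_screeninfo vinfo;
    struct fb_fix_screeninfo finfo;

    if (ioctl(fd, FBIOGET_VSCREENINFO, &vinfo) < 0 ||
        ioctl(fd, FBIOGET_FSCREENINFO, &finfo) < 0)
        return 0;

    /* line_length already includes any per-row padding. */
    return (size_t)finfo.line_length * vinfo.yres;
}

/* Hypothetical helper: map `len` bytes of `path` and copy them into a
 * malloc'ed buffer. Returns NULL on failure; the caller frees the result. */
static uint8_t *snapshot_mapped(const char *path, size_t len)
{
    assert(path != NULL);
    if (len == 0)
        return NULL;

    int fd = open(path, O_RDONLY);
    if (fd < 0)
        return NULL;

    void *map = mmap(NULL, len, PROT_READ, MAP_SHARED, fd, 0);
    close(fd);                      /* the mapping stays valid after close() */
    if (map == MAP_FAILED)
        return NULL;

    uint8_t *copy = malloc(len);
    if (copy != NULL)
        memcpy(copy, map, len);     /* this is the read that returns stale pixels */

    munmap(map, len);
    return copy;
}
```

In real use the length passed to `snapshot_mapped("/dev/fb0", ...)` would come from `fb_frame_bytes()` on an open descriptor for the same device.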

Everything seems fine, except the top right corner of the image I get is from the previous frame.

It seems like a cache coherency problem to me. Specifically, I'm running this on an ARM chip where the GPU and CPU share memory.

Is that right? Is it common practice to invalidate the cache after mmapping the frame buffer?

If I do need to invalidate the cache, which API call should I use?

I'm trying Memory.h from TI's SDK, but is there a more standard Linux/POSIX alternative?

@artlessnoise I posted this question because I suspected the GPU had written into the frame buffer while the CPU's cache was not aware of it. But I guess I was wrong. I'm trying to dump a frame from the frame buffer after rendering finishes, but the dumped image is always missing its top right corner, although it looks OK on the screen. I tried adding a 4 ms delay after the render-finished signal, and that fixes it, so I'm now focusing on why the completion signal arrives early, or why the rendering finishes late. — user3528438, Mar 05 '15 at 19:30
*GPU has written into frame buffer while CPU's cache*; Ah, that is not a 'frame buffer', and the DRM drivers are needed. It is possible that a GPU has written to the display, but I am not sure whether the GPU writes show up. All of my previous comments are for a *pure framebuffer* device. GPU memory might not even be accessible from the main CPU. The GPU memory might not even be in the same format (RGB16 vs 24, etc.). Sometimes the display is composited from different layers (video memory). — artless noise, Mar 05 '15 at 20:04
One thing to try if you suspect CPU cache effects is to, on another thread, generate as much memory traffic as you can (e.g. continuously `memcpy` buffers several times larger than the cache) in order to pollute the cache and make "legitimate" accesses far less likely to hit. However, I take "missing a corner" to mean a neat square/rectangle, rather than some cache-line-sized runs of pixels, which does smack more of a tile-based GPU combined with some kind of write buffering at that end. — Notlikethat, Mar 05 '15 at 23:17
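A minimal sketch of that cache-pollution experiment, assuming POSIX threads and C11 atomics are available. The `polluter` struct and its functions are hypothetical names, and the buffer size should be chosen several times larger than the last-level cache of the target chip.

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper state for a background thread that keeps
 * copying one large buffer into another to evict cache lines. */
struct polluter {
    pthread_t thread;
    atomic_bool stop;
    size_t bytes;
    unsigned char *src, *dst;
    unsigned long passes;   /* written by the thread, read after join */
};

static void *polluter_main(void *arg)
{
    struct polluter *p = arg;
    do {
        memcpy(p->dst, p->src, p->bytes);   /* streams through both buffers */
        p->passes++;
    } while (!atomic_load(&p->stop));
    return NULL;
}

/* Starts the polluting thread; returns 0 on success, -1 on failure. */
static int polluter_start(struct polluter *p, size_t bytes)
{
    assert(p != NULL && bytes > 0);
    p->bytes = bytes;
    p->passes = 0;
    atomic_init(&p->stop, false);
    p->src = malloc(bytes);
    p->dst = malloc(bytes);
    if (p->src == NULL || p->dst == NULL)
        goto fail;
    memset(p->src, 0xA5, bytes);
    if (pthread_create(&p->thread, NULL, polluter_main, p) != 0)
        goto fail;
    return 0;
fail:
    free(p->src);
    free(p->dst);
    return -1;
}

/* Stops the thread and returns how many full copies it completed. */
static unsigned long polluter_stop(struct polluter *p)
{
    atomic_store(&p->stop, true);
    pthread_join(p->thread, NULL);
    free(p->src);
    free(p->dst);
    return p->passes;
}
```

The idea would be to call `polluter_start()` before the frame dump and `polluter_stop()` after it. If the stale corner persists while the cache is being churned like this, CPU caching becomes a less likely explanation.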

Cool Goose · Answer 1 · 2016-02-06T20:36:57.667

1

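Make sure the GPU has finished writing data into the buffer before you call memcpy(). In that case you do not need to invalidate the cache, as there will be no caching of your newly mmapped buffer. If you suspect that you are copying cached data, you can use the following API to invalidate the cache for an address range. Note that it is a kernel-internal interface, so it can only be called from kernel code such as a driver, not from a user-space program: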

outer_cache.inv_range()

See header file arch/arm/include/asm/outercache.h.

edited Feb 06 '16 at 20:36

answered Feb 06 '16 at 19:55
