glDrawArrays: when has it finished?

Question

Pseudocode:

void draw()
{
    Vertex* vertices = scene.GetVertexArray();
    glEnableClientState(...);
    glVertexPointer(..., vertices);
    glDrawArrays(...);
    glDisableClientState(...);
    delete[] vertices;
}

I'm not using VBOs since I want to support older OpenGL implementations.

After calling glDrawArrays, I want to:

deallocate my vertex array ("delete[] vertices;")
perhaps modify some of the vertices

However, GL is free to perform the glDrawArrays asynchronously, and it's not safe to deallocate or modify my array until it has finished.

I could do a glFinish to ensure that, but it'd slow down the app.

So at what moment am I free to deallocate/modify my vertex array?

datenwolf · Answer 1 · 2011-05-02T10:27:01.587

6

OpenGL guarantees that once any function that actually accesses client memory returns, you can modify or deallocate that memory. Those functions are:

glDrawArrays (after it returns, the memory that gl{Normal,Color,TexCoord,VertexAttrib,Vertex}Pointer was set to can be disposed of)
glDrawElements (after it returns, the memory that gl{Normal,Color,TexCoord,VertexAttrib,Vertex}Pointer was set to and the element array can be disposed of)
glTexImage (the memory that data points to)
glTexSubImage (the memory that data points to)
glBufferData (the memory that data points to)
glBufferSubData (the memory that data points to)

It is important to know that gl{Normal,Color,TexCoord,VertexAttrib,Vertex}Pointer only sets a pointer and does not create a copy. However, a copy of sorts is made by the glDrawElements and glDrawArrays calls. Depending on the driver, a physical copy is not made immediately; instead the memory management is adjusted for a copy-on-write scheme, which saves crucial bandwidth and CPU cycles if the buffer is not modified or deallocated by the user program.
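The guarantee above can be illustrated with a toy model. This is a minimal sketch, not the real OpenGL API: `ToyContext`, `vertexPointer`, `drawArrays`, `pending` and `drawAndFree` are hypothetical names, and the eager `memcpy` is an assumption about one way a driver could honor the guarantee.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <utility>
#include <vector>

// Toy model only: none of these names belong to OpenGL.
struct Vertex { float x, y, z; };

class ToyContext {
public:
    // Like glVertexPointer: remembers the address, copies nothing.
    void vertexPointer(const Vertex* p) { client_ = p; }

    // Like a non-VBO glDrawArrays: snapshots the referenced range before
    // returning, so the caller's memory is no longer needed afterwards.
    void drawArrays(std::size_t first, std::size_t count) {
        assert(client_ != nullptr);
        std::vector<Vertex> snapshot(count);
        std::memcpy(snapshot.data(), client_ + first, count * sizeof(Vertex));
        pending_.push_back(std::move(snapshot));
    }

    // Stands in for work the implementation executes later, asynchronously.
    const std::vector<std::vector<Vertex>>& pending() const { return pending_; }

private:
    const Vertex* client_ = nullptr;
    std::vector<std::vector<Vertex>> pending_;
};

// Mirrors the question's draw(): modify and free right after the draw call.
inline float drawAndFree(ToyContext& ctx) {
    Vertex* vertices = new Vertex[3]{{0.0f, 0.0f, 0.0f},
                                     {1.0f, 0.0f, 0.0f},
                                     {0.0f, 1.0f, 0.0f}};
    ctx.vertexPointer(vertices);
    ctx.drawArrays(0, 3);
    vertices[1].x = 42.0f;  // modifying after the call does not affect the draw
    delete[] vertices;      // neither does deallocating
    return ctx.pending().back()[1].x;  // read from the queued snapshot
}
```

In this model the queued snapshot is independent of the caller's array, which is the behavior the application may rely on once the real draw call has returned.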

edited May 02 '11 at 10:27

answered May 01 '11 at 11:49


Thanks. How is the copy-on-write implemented? How can the driver know I've changed the data? – coder123 May 01 '11 at 12:20
It is not, the driver does not know, and there is no copy-on-write, that's wishful thinking. If you don't use buffer objects, a copy will be made at the time you call a function like glDrawElements. The GL has no control over where your client data is located or that it is page-aligned, it would therefore have a very hard time implementing a copy-on-write scheme. That would be like asking for the disk driver to use DMA when reading into some arbitrary buffer of yours. – Damon May 01 '11 at 12:38
@Damon: I wrote exactly what you mentioned (please read my last paragraph again): By "glDrawElements and glDrawArrays calls". Next time please read carefully what I wrote. – datenwolf May 01 '11 at 12:54
@datenwolf: Hmm, you wrote that "usually not a copy is made by glDrawElements". – coder123 May 01 '11 at 12:56
@coder123: (Parts of the) drivers usually live in kernel space. In kernel space they have the opportunity to fiddle with the memory management. What happens is the following: the pages with the data passed to the driver are marked as read-only, and a second set of pages is set aside to receive writes to this memory. Once the program tries to modify those pages, a page fault happens, which causes the kernel to jump into the page fault handler. In that handler the page table is adjusted so that the user space process now uses the new pages, while the driver continues to use the original pages. – datenwolf May 01 '11 at 12:59
@coder123: The net effect is that the program in user space sees nothing of how this copy is created; to the program it still uses the same (virtual!) memory. As long as the program doesn't fiddle with the data, no copy is made. But if it does change the data, the copy-on-write mechanism ensures that the driver still has an unmodified working copy. That way the (expensive) copy only has to be done when it's actually required. If the GPU has its own memory, a copy will be made eventually, of course. – datenwolf May 01 '11 at 13:05
@datenwolf: I believe that I did read carefully. The paragraph that I see as wishful thinking is "usually not a copy is made but the memory management adjusted for a copy-on-write scheme" - there is no way that this is the case. – Damon May 01 '11 at 13:10
@Damon: In general it depends on the driver in question. I admit that I've never looked into the details of the Linux DRM kernel modules or the NVidia driver kernel module (however I'll do that rather soon due to some project I'm working on). But from my knowledge of audio drivers, which are under much greater latency pressure and thus must avoid copies where possible, I can assure you that copy-on-write mechanisms are employed regularly. Also the whole Linux networking stack uses copy-on-write: between the send syscall and the NIC driver the content of a packet never gets copied. – datenwolf May 01 '11 at 13:17
See, a driver writer would have to be one stupid driver writer if he relied on COW to access _user-owned_ data. This is an entirely different thing than e.g. "between the send syscall and the NIC driver", because that is a properly aligned buffer owned by the kernel. The important detail is the copy from user to kernel space before, which is mandatory. You are perfectly allowed to write code like: `p = mmap("obj"); glVertexPointer(p...); glDrawElements(...); munmap(...);`. The driver has no way of knowing that the pages were unmapped, so if it relied on COW, that would crash the driver. – Damon May 01 '11 at 14:36
@Damon: No, it wouldn't crash; a comment doesn't suffice to explain in detail, but it has to do with the way mmap works. It boils down to marking that address range as "swapped" paged memory on the blocks of the device where the file lies. This usually causes the block layer to fetch the contents of these blocks into the disk cache. But more importantly those pages are allocated and backed by the memory management, and even if the program unmaps them, the marking by the driver is not lost and so the connection is kept. – datenwolf May 01 '11 at 15:31
@Damon: BTW there is no such thing as "user owned data". All address space and memory is owned by the kernel; it just hands out portions of it to user space. Also malloc and free are usually implemented through mmap, munmap on anonymous memory, or /dev/shm (which is a tmpfs, which by itself is just a part of the block cache space with a filesystem overlay). And on the kernel side mmap and munmap are heavily loaded with all kinds of COW stuff; just look at fork syscall, which COWs the processes' address spaces. – datenwolf May 01 '11 at 15:36
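The fork behavior mentioned in the comment above can be observed from user space. The following is a minimal POSIX-only sketch; `childViewAfterParentWrite` is a hypothetical helper name. It shows only the visible effect (the child keeps the old contents after the parent writes). Whether the kernel implements this by lazily copying pages is not observable from this code, and it says nothing about what a GL driver does.

```cpp
#include <cassert>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>
#include <vector>

// Returns the value the child sees in data[0] after the parent has
// overwritten it, or -1 on error.
int childViewAfterParentWrite() {
    std::vector<int> data(1024, 7);
    int toChild[2];
    int toParent[2];
    if (pipe(toChild) != 0) return -1;
    if (pipe(toParent) != 0) {
        close(toChild[0]);
        close(toChild[1]);
        return -1;
    }

    pid_t pid = fork();
    if (pid < 0) {
        close(toChild[0]);
        close(toChild[1]);
        close(toParent[0]);
        close(toParent[1]);
        return -1;
    }

    if (pid == 0) {
        // Child: wait until the parent has written, then report data[0].
        close(toChild[1]);
        close(toParent[0]);
        char go = 0;
        if (read(toChild[0], &go, 1) != 1) _exit(1);
        int seen = data[0];
        if (write(toParent[1], &seen, sizeof seen) !=
            static_cast<ssize_t>(sizeof seen)) {
            _exit(1);
        }
        _exit(0);
    }

    // Parent: modify the shared-looking buffer after the fork.
    close(toChild[0]);
    close(toParent[1]);
    data[0] = 99;
    assert(data[0] == 99);  // the parent sees its own write

    char go = 'g';
    int seen = -1;
    if (write(toChild[1], &go, 1) == 1) {
        if (read(toParent[0], &seen, sizeof seen) !=
            static_cast<ssize_t>(sizeof seen)) {
            seen = -1;
        }
    }
    close(toChild[1]);
    close(toParent[0]);
    waitpid(pid, nullptr, 0);
    return seen;
}
```

The child reports the value from before the parent's write, because each process has its own view of the address space after fork.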
Much of this is Linux specific, and I feel like our discussion is going somewhat off-topic beyond being useful. But even so, you can easily verify that such assumptions do not even "usually" hold true under Linux (I would not say one can't find a special case where they do (tee/splice comes to mind), but not "usually"). Fill a large memory block and io_submit a large write. Then immediately modify some data that is a couple of megabytes ahead. Under the assumption that the kernel owns all memory and thus will just do page magic, this should not corrupt any data. And yet, it does. – Damon May 01 '11 at 17:01
@Damon: Yes, it's Linux specific; nevertheless, in this very example you break rules set by the API. glDraw{Elements,Arrays} are specified such that after they return, any modifications to the buffer will no longer influence the further operation of OpenGL; thus the parts in the driver implementing glDraw{Elements,Arrays} can set up the proper guards to protect its data. OTOH the specification of io_submit clearly states that the buffer must not be modified until completion; this constraint lifts the burden on the driver/implementation to preserve the data. – datenwolf May 01 '11 at 17:51
Of course, but it is irrelevant whether the io_submit API states that the buffer must not be modified until completion. The important detail is that the kernel developers _did not_ implement it in a modification-safe way, although this would make aio application development a lot easier and less error-prone and would use memory more efficiently, too. Which, if it is really a no-brainer, makes you wonder why. Having said that, I've had a look at the Gallium/Mesa sources (which is the base of the nVidia and ATI open source drivers, and the only GPU driver sources that I have access to) ... – Damon May 02 '11 at 11:31
... which, in the case of EXT_vertex_arrays/GL1.1 glDrawArrays (i.e. non-VBO) ends up calling `emit_element_old` in a loop, which in turn contains the line `memcpy(dst, ((GLubyte *) arrays->arrays[i].data) + offset, arrays->arrays[i].element_size);`. I admit that I may have been reading the code wrong, but to me it does not look very much like remapping pages. – Damon May 02 '11 at 11:32
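The per-element copy described in the comment above can be sketched roughly as follows. This is an illustrative model and not Mesa's code: `ClientArray`, `emitElement` and `gather` are hypothetical names, and the `stride` field is an assumption added so that interleaved data can be addressed.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <vector>

// Hypothetical stand-in for one enabled client array (position, color, ...).
struct ClientArray {
    const unsigned char* data;  // pointer the application handed over
    std::size_t element_size;   // bytes per element
    std::size_t stride;         // bytes between consecutive elements
};

// Copies element `index` of every array to dst, back to back,
// and returns the advanced destination pointer.
unsigned char* emitElement(unsigned char* dst,
                           const std::vector<ClientArray>& arrays,
                           std::size_t index) {
    for (const ClientArray& a : arrays) {
        assert(a.data != nullptr);
        std::memcpy(dst, a.data + index * a.stride, a.element_size);
        dst += a.element_size;
    }
    return dst;
}

// Gathers `count` vertices starting at `first` into one private buffer.
// After this returns, the client arrays are no longer referenced.
std::vector<unsigned char> gather(const std::vector<ClientArray>& arrays,
                                  std::size_t first, std::size_t count) {
    std::size_t vertex_size = 0;
    for (const ClientArray& a : arrays) vertex_size += a.element_size;

    std::vector<unsigned char> out(vertex_size * count);
    unsigned char* dst = out.data();
    for (std::size_t i = 0; i < count; ++i) {
        dst = emitElement(dst, arrays, first + i);
    }
    return out;
}
```

Because the gathered buffer owns its bytes, later changes to the application's arrays cannot reach it, which matches a plain copy rather than page remapping.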
@Damon: Which implementation? SGI Reference? Mesa? As I already wrote, this is highly implementation dependent. On Unix systems the libGL.so comes with the driver; a driver-dependent libGL.so is required for direct rendering to work (IMHO this should have been done using a trampoline library, into which the driver's GL implementation hooks). On Windows the opengl32.dll offers this kind of trampoline, into which the vendor's ICD (Installable Client Driver) hooks. Also I'm pretty sure the code of glDraw{Arrays,Elements} of NVidia's proprietary drivers has not been disclosed. – datenwolf May 02 '11 at 12:46
