17

Khronos just released their new memory model extension, but there is no informal discussion, example implementation, etc. available yet, so I am confused about the basic details.

https://www.khronos.org/blog/vulkan-has-just-become-the-worlds-first-graphics-api-with-a-formal-memory-model.-so-what-is-a-memory-model-and-why-should-i-care

https://www.khronos.org/registry/vulkan/specs/1.1-extensions/html/vkspec.html#memory-model

What do these new extensions try to solve exactly? Are they trying to solve synchronization problems at the language level (say, to remove onerous mutexes in your C++ code), or are they a new and complex set of features that give you more control over how the GPU deals with synchronization internally?

(Speculative question) Would it be a good idea to learn and incorporate this new model in the general case, or would it only apply to certain multi-threaded patterns and potentially add overhead?

RWilco8

2 Answers

18

Most developers won't need to know about the memory model in detail, or use the extensions, in the same way that most C++ developers don't need to be intimately familiar with the C++ memory model. (This isn't just because of x86; it's because most programs don't need anything beyond using standard library mutexes appropriately.)

But the memory model allows specifying Vulkan's memory coherence and synchronization primitives with a lot less ambiguity -- and in some cases, additional clarity and consistency. For the most part the definitions didn't actually change: code that was data-race-free before continues to be data-race-free. For a few developers doing advanced or fine-grained synchronization, the additional precision and clarity allow them to know exactly how to make their programs data-race-free without using expensive, overly strong synchronization.

Finally, in building the model the group found a few things that were poorly-designed or broken previously (many of them going all the way back into OpenGL). They've been able to now say precisely what those things do, whether or not they're still useful, and build replacements that do what was actually intended.

The extension advertises that these changes are available, but even more than that, once the extension is final instead of provisional, it will mean that the implementation has been validated to actually conform to the memory model.

Jesse Hall
  • On one hand I am glad these great extensions are being added but I now have to refactor code and buy another book :( – RWilco8 Sep 15 '18 at 16:04
  • 1
    Out of curiosity, why is that? For most developers, it shouldn't invalidate anything they were already doing (and if it does, they were probably going to run into portability problems between implementations anyway). – Jesse Hall Sep 16 '18 at 14:57
  • Don't we have to use different data structures and api calls to make use of this new extension? I have a highly parallel Vulkan API for my engine with a custom memory allocator. Lots of mutexes and stuff (actually Ada protected types) to make sure things get set on the GPU memory before use. – RWilco8 Sep 16 '18 at 17:09
  • 2
    No, the only API change is a physical device feature query struct that tells you whether the memory model is supported, and if so, whether it supports the "Device" scope (vs "Queue" scope as the broadest scope). Nothing significant should need to change in your host code, unless it was incorrect before. – Jesse Hall Sep 17 '18 at 06:32
5

Among other things, it enables the same kind of fine-grained memory ordering guarantees for atomic operations that are described for C++ here. I would venture to say that many, perhaps even most, PC C++ developers don't really know much about this, because the x86 architecture's strongly ordered memory model makes most of the weaker ordering choices behave the same in practice.

However, GPUs are not x86 architecture, and compute operations executed in massively parallel fashion on GPU shader cores probably can, and in some cases must, use explicit ordering guarantees to be valid when working against shared data sets.
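To make the C++ side of that comparison concrete, here is a minimal sketch of release/acquire ordering using only the standard library. It is a CPU-side illustration, not Vulkan or shader code, and the function name `run_message_passing` and the value 42 are made up for the example. One thread writes plain data and then publishes it with a release store; the other waits with an acquire load before reading the data.

```cpp
#include <atomic>
#include <cassert>
#include <thread>

// Hypothetical helper for illustration: returns the payload value that
// the consumer thread observed.
int run_message_passing() {
    int payload = 0;                 // plain, non-atomic shared data
    std::atomic<bool> ready{false};  // flag used to publish the payload
    int observed = -1;

    std::thread producer([&] {
        payload = 42;                                  // 1. write the data
        ready.store(true, std::memory_order_release);  // 2. publish it
    });

    std::thread consumer([&] {
        // Spin until the flag is set. The acquire load pairs with the
        // release store above, so the write to payload is guaranteed to
        // be visible once the loop exits.
        while (!ready.load(std::memory_order_acquire)) {
            std::this_thread::yield();
        }
        observed = payload;
    });

    producer.join();
    consumer.join();

    // join() synchronizes with thread completion, so reading observed
    // here is race-free.
    assert(observed == 42);
    return observed;
}
```

If both operations used `std::memory_order_relaxed` instead, the read of `payload` would be a data race under the C++ model, even though it would usually appear to work on x86. That gap between "appears to work" and "is guaranteed to work" is the kind of thing a formal memory model pins down.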

This video is a good presentation on atomics and ordering as it applies to C++.

Jherico