Why are depth buffers faster than depth textures?

Question

This tutorial on shadow-mapping in OpenGL briefly mentions the difference between using a depth buffer and a depth texture (edit: to store per pixel depth information for depth testing or other purposes, such as shadow-mapping) by stating:

Depth texture. Slower than a depth buffer, but you can sample it later in your shader

However, this got me wondering why this is so. After all, both seem to be nothing more than a two-dimensional array containing some data, and Microsoft's notes on graphics define them in very similar terms. (As pointed out in a comment, these notes are not about OpenGL but about another graphics engine. The purpose of depth buffers and depth textures seems to be quite similar, though, and I have not found an equivalent description of the two for OpenGL, so I have decided to keep these articles. If someone has a link to an article describing depth buffers and depth textures in OpenGL, you are welcome to post it in the comments.)

A depth buffer contains per-pixel floating-point data for the z depth of each pixel rendered.

and

A depth texture, also known as a shadow map, is a texture that contains the data from the depth buffer for a particular scene

Of course, there are a few differences between the two methods -- notably, the depth texture can be sampled later, unlike the depth buffer.
Despite these differences, I cannot see why a depth buffer should be faster to use than a depth texture. My question is therefore: why can't these two methods of storing the same data be equally fast (edit: when used for storing depth data for depth testing)?

Note that both articles refer to XNA, not OpenGL. The last version of XNA was published in 2011. There are notable differences between what XNA could do and what OpenGL can do. For example, it is absolutely possible to store stencil information in a texture by using a `GL_DEPTH_STENCIL` format. — BDL, Jul 26 '17 at 08:12
I think, but I can be mistaken, that depth renderbuffers permit some extra optimizations, like hierarchical Z-buffers, which depth textures cannot support. — Yakov Galka, Jul 26 '17 at 10:22

score 2 · Accepted Answer · answered Jul 26 '17 at 15:29

By "depth buffer", I will assume you mean "renderbuffer with a depth format".

Possible reasons why a depth renderbuffer might be faster to render to than a depth texture include:

1. A depth renderbuffer can live within specialized memory that is not shader-accessible, since the implementation knows that you can't access it from the shader.
2. A depth renderbuffer might be able to have a special format or layout that a depth texture cannot have, since the texture has to be shader-accessible. This could include things like Hi-Z/Hierarchical-Z and so forth.
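To make the Hi-Z idea mentioned above a bit more concrete, here is a minimal CPU-side sketch in C++. It is only an illustration of the general principle (keep a coarse per-tile summary of the depth buffer so whole tiles can be rejected without touching per-pixel data), not a description of how any particular GPU or driver stores its depth data. The types `DepthBuffer` and `HiZ` and the functions `buildHiZ` and `tileCanBeRejected` are hypothetical names made up for this example, and it assumes a `GL_LESS`-style depth test with 0.0 as near and 1.0 as far.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper types for illustration; not part of OpenGL or any driver API.
struct DepthBuffer {
    int width;
    int height;
    std::vector<float> depth;  // row-major, 0.0 = near, 1.0 = far
};

struct HiZ {
    int tile;                     // tile edge length in pixels
    int tilesX;
    int tilesY;
    std::vector<float> maxDepth;  // farthest stored depth per tile
};

// Build the coarse level: for every tile, record the farthest depth it contains.
// Assumes the tile size divides the buffer dimensions evenly.
HiZ buildHiZ(const DepthBuffer& db, int tile) {
    assert(tile > 0 && db.width % tile == 0 && db.height % tile == 0);
    HiZ h{tile, db.width / tile, db.height / tile, {}};
    h.maxDepth.assign(static_cast<std::size_t>(h.tilesX) * h.tilesY, 0.0f);
    for (int y = 0; y < db.height; ++y) {
        for (int x = 0; x < db.width; ++x) {
            const float d = db.depth[static_cast<std::size_t>(y) * db.width + x];
            float& m = h.maxDepth[static_cast<std::size_t>(y / tile) * h.tilesX + x / tile];
            m = std::max(m, d);
        }
    }
    return h;
}

// Conservative coarse test for a "less than" depth function: if the nearest
// incoming depth in this tile is not closer than the farthest depth already
// stored there, no pixel in the tile can pass the per-pixel test.
bool tileCanBeRejected(const HiZ& h, int tileX, int tileY, float incomingMinDepth) {
    assert(tileX >= 0 && tileX < h.tilesX && tileY >= 0 && tileY < h.tilesY);
    return incomingMinDepth >= h.maxDepth[static_cast<std::size_t>(tileY) * h.tilesX + tileX];
}
```

The point of the sketch is that the coarse `maxDepth` array is extra data with its own layout. A shader sampling a depth texture expects plain per-pixel values, which is one plausible reason an implementation might treat the two kinds of attachment differently.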

#1 tends to crop up on tile-based architectures. If you do things right, you can keep your depth renderbuffer entirely within tile memory. That means that, after a rendering operation, there is no need to copy it out to main memory. By contrast, with a depth texture, the implementation can't be sure you don't need to copy it out, so it has to do so just to be safe.
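For a rough sense of what that copy-out could cost, here is a back-of-the-envelope calculation in C++. It assumes an uncompressed depth attachment that is written to main memory in full once per frame; real hardware may compress depth data or skip the store entirely, so treat the numbers as an upper-bound illustration only. The function names and the 4-bytes-per-sample figure (for example, a 32-bit depth format) are assumptions for this example.

```cpp
#include <cassert>
#include <cstdint>

// Bytes written to main memory if a full, uncompressed depth attachment
// is copied out of tile memory once. bytesPerSample is an assumption,
// e.g. 4 for a 32-bit depth format.
constexpr std::uint64_t depthStoreBytes(std::uint64_t width,
                                        std::uint64_t height,
                                        std::uint64_t bytesPerSample) {
    return width * height * bytesPerSample;
}

// Same figure scaled by frame rate, assuming one store per frame.
inline std::uint64_t depthStoreBytesPerSecond(std::uint64_t width,
                                              std::uint64_t height,
                                              std::uint64_t bytesPerSample,
                                              std::uint64_t framesPerSecond) {
    assert(framesPerSecond > 0);
    return depthStoreBytes(width, height, bytesPerSample) * framesPerSecond;
}

// 1920 x 1080 samples at 4 bytes each is 8,294,400 bytes per store.
static_assert(depthStoreBytes(1920, 1080, 4) == 8294400,
              "one 1080p 32-bit depth store");
```

At 60 frames per second that works out to 497,664,000 bytes per second under these assumptions, which is bandwidth a tile-based renderer could avoid if the depth data never has to leave tile memory.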

Note that this list is purely speculative. Unless you've actually profiled it, or have some specific knowledge of hardware (as in the TBR case), there's no reason to assume that there is any substantial difference in performance.

Just a little refinement for 2: Hi-Z and Hierarchical-Z are the same thing. Usually, advanced Z techniques can be used even with a depth texture. But once rendering is complete and access to the depth samples is needed, a resolve/decompress pass is required, which can be pretty expensive. That's why a depth texture is slower. (But there may be GPUs today that don't require a separate pass for this and can sample directly from the compressed version.) — geza, Jul 26 '17 at 20:44
