
First, a disclaimer. I'm well aware of the standard answer for X vs Y: "it depends". However, I'm working on a very general-purpose product, and I'm trying to figure out "it depends on what". I'm also not really able to test on the wide variety of hardware out there, so try-and-see is an imperfect measure at best.

I've been doing some googling, and I've found very little reference to using an offscreen render target/surface for hit-testing. I'm not sure of the nomenclature; what I'm talking about is using a very simple shader to render a geometry ID (for example) to a buffer, then reading the pixel value under the mouse to see what geometry is directly under the mouse pointer.
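To be concrete, here's roughly the kind of shader I mean. This is only a sketch, in D3D9-era HLSL embedded as a C++ string (D3DXCompileShader accepts source from memory); the constant names and the packing of node ID + UV into the colour output are my own illustration, not a standard scheme.

```
// Minimal pick-pass shaders: transform the vertex, then write a
// per-draw node ID plus the interpolated UV into the colour output.
const char* g_pickShaderSrc = R"hlsl(
    float4x4 gWorldViewProj;
    float    gNodeID;   // per-draw node/geometry ID, set from the app

    void PickVS(float4 pos : POSITION, float2 uv : TEXCOORD0,
                out float4 oPos : POSITION, out float2 oUV : TEXCOORD0)
    {
        oPos = mul(pos, gWorldViewProj);   // transform only, no lighting
        oUV  = uv;
    }

    // Pack the ID into R and the UV into GB. An 8-bit target caps IDs
    // at 255; a float target (e.g. D3DFMT_A32B32G32R32F) lifts that.
    float4 PickPS(float2 uv : TEXCOORD0) : COLOR0
    {
        return float4(gNodeID / 255.0, uv.x, uv.y, 1.0);
    }
)hlsl";
```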

I have, however, found 101 different tutorials on doing triangle intersection, à la D3DXIntersect and the DirectX "Pick" sample.

I'm a little curious about this: I would have thought using the hardware was the standard method. By all rights it should be orders of magnitude faster, and it should scale far better.

I'm relatively new to graphics programming, so here are my assumptions; please disabuse me of any that are wrong.

1) A simple shader that does the geometry transform and writes a node ID + UV value (as in the sketch above) should be nearly free.

2) The main cost in the HW pick method would be the buffer fetch: getting the rendered surface back off the GPU for the CPU to read over. I have no idea how costly this is. µs? ms? Seconds? Minutes? (There's a read-back sketch after this list.)

3) This may be obvious, but I'm assuming that triangle intersection (D3DXIntersect) is only possible on the CPU. (A sketch of that path follows the second list below.)

4) A possible cost people want to avoid is the cost of the extra render target(s) (z-buffer + surface). I'm guessing about 10 MB for 1280x1024 (a standard screen size?): 1280 × 1024 pixels × (4-byte colour + 4-byte depth) ≈ 10 MB. This is acceptable to me, although if I could render a smaller surface (trading accuracy for memory) I would do so (is that possible?).
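To put a shape on assumptions 2 and 4, here is roughly the read-back I have in mind (a D3D9 sketch, error handling omitted; note that GetRenderTargetData requires the system-memory surface to match the render target's size and format, which is partly why I ask about smaller surfaces above):

```
#include <d3d9.h>

// Read the picked pixel back after rendering the ID buffer.
DWORD ReadPickedPixel(IDirect3DDevice9* device, IDirect3DSurface9* pickRT,
                      UINT width, UINT height, int mouseX, int mouseY)
{
    // System-memory surface; must match the render target's dimensions
    // and format for GetRenderTargetData to accept it.
    IDirect3DSurface9* sysmem = NULL;
    device->CreateOffscreenPlainSurface(width, height, D3DFMT_A8R8G8B8,
                                        D3DPOOL_SYSTEMMEM, &sysmem, NULL);

    // The expensive step: stalls until the GPU has finished rendering
    // the ID buffer, then copies the whole surface over the bus.
    device->GetRenderTargetData(pickRT, sysmem);

    // Lock just the 1x1 rect under the mouse; the transfer has already
    // happened above, so this part is cheap.
    RECT px = { mouseX, mouseY, mouseX + 1, mouseY + 1 };
    D3DLOCKED_RECT lr;
    sysmem->LockRect(&lr, &px, D3DLOCK_READONLY);
    DWORD packed = *(const DWORD*)lr.pBits;   // node ID + UV, as packed
    sysmem->UnlockRect();
    sysmem->Release();
    return packed;
}
```

In real code I'd create the system-memory surface once and cache both it and the result while the scene is static; that's the reuse I mention in thought 2 below.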

This all leads to a few thoughts.

1) For very simple scenes, triangle intersection may be faster. Quite where the simple/complex boundary lies is hard to guess at this point. I'm looking at possibly hundreds of tris, up to tens of thousands; probably not much more than that.

2) The HW buffer needs to be rendered regardless of whether or not it's used (in my case). However, it can be reused without cost (e.g. click-drag, where the mouse tracks across a static scene).

2a) Triangle intersection may therefore be preferable if my scene updates every frame, or if I have limited mouse interaction.
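For comparison, the triangle-intersection path those tutorials describe looks roughly like this (again a sketch; the viewport and matrices are assumed to be whatever the scene was rendered with):

```
#include <d3dx9.h>

// Build a pick ray from the mouse position and test it on the CPU.
BOOL PickTriangle(ID3DXMesh* mesh, int mouseX, int mouseY,
                  const D3DVIEWPORT9& vp, const D3DXMATRIX& world,
                  const D3DXMATRIX& view, const D3DXMATRIX& proj,
                  DWORD* faceIndex, float* u, float* v, float* dist)
{
    // Unproject points on the near and far planes; passing `world`
    // folds the inverse world transform in, giving a mesh-local ray.
    D3DXVECTOR3 screenNear((float)mouseX, (float)mouseY, 0.0f);
    D3DXVECTOR3 screenFar ((float)mouseX, (float)mouseY, 1.0f);
    D3DXVECTOR3 rayStart, rayEnd;
    D3DXVec3Unproject(&rayStart, &screenNear, &vp, &proj, &view, &world);
    D3DXVec3Unproject(&rayEnd,   &screenFar,  &vp, &proj, &view, &world);

    D3DXVECTOR3 rayDir = rayEnd - rayStart;
    D3DXVec3Normalize(&rayDir, &rayDir);

    // Walks the mesh's triangles on the CPU; returns the nearest hit
    // plus barycentric hit coordinates, the analogue of the ID+UV read.
    BOOL hit = FALSE;
    D3DXIntersect(mesh, &rayStart, &rayDir, &hit,
                  faceIndex, u, v, dist, NULL, NULL);
    return hit;
}
```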

Now that I've finished writing, I see a similar question has been asked (3D Graphics Picking - What is the best approach for this scenario). My problem with it is: (a) why would you need to re-render your picking surface during a click-drag, given the scene hasn't actually changed, and (b) wouldn't it still be faster than triangle intersection?

I welcome thoughts, criticism, and any manner of side-tracking :-)

  • Could you elaborate on this click-drag scenario? In my mind, when you click and drag something the selected object does not change while dragging. So you would only need a single selection test when the drag sequence was initiated. – Andon M. Coleman Sep 18 '13 at 15:41
  • By the way, triangle intersection (or more generally, any sort of proxy geometry intersection test) is _usually_ faster as it does not require CPU/GPU synchronization. There are scenarios where pixel-perfect selection tests are preferred - situations where the GPU does dynamic LOD/visibility independent of the CPU (e.g. tessellated geometry or occlusion culling), etc. If your use-case can tolerate a multi-frame latency, a single pixel read-back is not all that expensive, especially if you are using deferred shading and already pack material/object id into the G-Buffers. – Andon M. Coleman Sep 18 '13 at 15:48
  • Regarding your point 4 above. You don't need to use a screen-sized buffer to render into, or trade any accuracy. You can render into a tiny buffer using a narrow projection pointing in the direction of the mouse click. – GuyRT Sep 18 '13 at 15:54
  • @AndonM.Coleman, click-drag is region select. I'm not sure precisely how I would implement that using triangle intersection; I thought that was one of the biggest pros of offscreen rendering. – FrozenKiwi Sep 18 '13 at 19:44
  • @AndonM.Coleman "tolerate a multi-frame latency" ?? Seriously? Multi-frame to read back a value? (so around 100ms?) I would not be ok with that... – FrozenKiwi Sep 18 '13 at 19:46
  • Yes to multi-frame, no to 100 ms. I am not referring to the time it takes to VSYNC your buffer swap, which would limit you to 60 FPS in most cases (and therefore potentially 16+ ms to read something back without stalling the pipeline), but the amount of time it takes the GPU to flush all the commands in the pipeline. The whole point of doing a deferred read-back would be to avoid synchronizing the CPU and GPU, this is the premise behind PBOs, occlusion queries, etc. – Andon M. Coleman Sep 18 '13 at 19:50
  • @GuyRT I considered a narrow window, but decided against it so I could cache the result for as long as the scene is static. The mouse cursor should change in response to what's being moused over (to indicate a valid selection, etc.). I also figured that the render cost was minimal compared to the memory-access cost. – FrozenKiwi Sep 18 '13 at 19:52
  • @AndonM.Coleman OK, this would make sense, as obviously we can't interrupt the render as it happens. AFAIK the CPU/GPU is synchronized at the end of each frame for immediate-mode rendering anyway, and I would do the read-back after that point, so I wouldn't be waiting for the current render to finish. – FrozenKiwi Sep 18 '13 at 20:14
  • @AndonM.Coleman If you want to put that info in the form of an answer (generally, prefer triangle intersection except in certain cases), I'd be happy to accept it. – FrozenKiwi Sep 18 '13 at 20:15
  • Perhaps later, I'm a little busy right now - I'll whip up something more useful than just a collection of these comments, because you clearly put a lot of thought into your question, it's the least you deserve. – Andon M. Coleman Sep 18 '13 at 23:35
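
Update (following Andon's comments above): here is a sketch of the deferred read-back idea in D3D9 terms. An event query acts as a fence, so the copy only happens once the GPU has finished the pick pass and no longer stalls the pipeline. The PendingPick structure and the frame-loop integration are my own assumptions.

```
#include <d3d9.h>

struct PendingPick
{
    IDirect3DQuery9* fence;   // event query issued after the pick pass
    int  mouseX, mouseY;
    bool inFlight;            // zero-initialize the struct before first use
};

// Call right after drawing the ID buffer.
void BeginPickReadback(IDirect3DDevice9* device, PendingPick& p,
                       int mouseX, int mouseY)
{
    if (!p.fence)
        device->CreateQuery(D3DQUERYTYPE_EVENT, &p.fence);
    p.fence->Issue(D3DISSUE_END);   // fence behind the pick commands
    p.mouseX = mouseX;
    p.mouseY = mouseY;
    p.inFlight = true;
}

// Poll once per frame; returns true when the result is ready.
bool TryFinishPickReadback(IDirect3DDevice9* device, PendingPick& p,
                           IDirect3DSurface9* pickRT,
                           IDirect3DSurface9* sysmem, DWORD* outId)
{
    // GetData returns S_FALSE while the GPU is still working.
    if (!p.inFlight ||
        p.fence->GetData(NULL, 0, D3DGETDATA_FLUSH) != S_OK)
        return false;

    // The GPU is done, so this copy no longer forces a pipeline stall
    // (the bus transfer itself still has a cost).
    device->GetRenderTargetData(pickRT, sysmem);

    RECT px = { p.mouseX, p.mouseY, p.mouseX + 1, p.mouseY + 1 };
    D3DLOCKED_RECT lr;
    sysmem->LockRect(&lr, &px, D3DLOCK_READONLY);
    *outId = *(const DWORD*)lr.pBits;
    sysmem->UnlockRect();
    p.inFlight = false;
    return true;
}
```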
