
For each pixel written by a shader in Unity3D, I would like to add its value to a global variable somewhere so that I can read it back later. For example, if the shader program iterates over 1000 pixels, writing the color (1, 0.5, 0) for each, the result I want back in my program would be (1000, 500, 0).

The actual calculation is much more complex, of course, and very time consuming, since it has to be done millions of times and involves multiple textures. So I need to take advantage of the parallel computing ability of the GPU.

I have read something about compute shaders, which allow a shader to write numbers that can be read back, but I am having a hard time finding any relevant example.

Any pointers would be useful.
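
For later readers, here is a minimal sketch of the compute-shader approach discussed in the comments below. All names (`AccumulateColor.compute`, `_Source`, `_Totals`, `ColorAccumulator`) are illustrative assumptions, not from an existing project. Since `InterlockedAdd` only works on integers, the color is scaled to fixed point before accumulating and scaled back after readback.

```hlsl
// AccumulateColor.compute (hypothetical file name)
#pragma kernel Accumulate

Texture2D<float4> _Source;          // input texture to sum over
RWStructuredBuffer<uint> _Totals;   // 3 uints: accumulated R, G, B in fixed point

[numthreads(8, 8, 1)]
void Accumulate(uint3 id : SV_DispatchThreadID)
{
    float4 c = _Source[id.xy];
    // Atomics require integers, so scale to fixed point (x255) before adding.
    InterlockedAdd(_Totals[0], (uint)(c.r * 255.0));
    InterlockedAdd(_Totals[1], (uint)(c.g * 255.0));
    InterlockedAdd(_Totals[2], (uint)(c.b * 255.0));
}
```

On the C# side, the buffer is zeroed, the kernel dispatched, and the totals read back with `ComputeBuffer.GetData` (which stalls until the GPU has finished):

```csharp
using UnityEngine;

public class ColorAccumulator : MonoBehaviour
{
    public ComputeShader accumulateShader;  // assign AccumulateColor.compute
    public Texture2D source;                // assumed to have dimensions divisible by 8

    void Start()
    {
        var totals = new ComputeBuffer(3, sizeof(uint));
        totals.SetData(new uint[3]);        // zero the accumulators

        int kernel = accumulateShader.FindKernel("Accumulate");
        accumulateShader.SetTexture(kernel, "_Source", source);
        accumulateShader.SetBuffer(kernel, "_Totals", totals);
        accumulateShader.Dispatch(kernel, source.width / 8, source.height / 8, 1);

        var result = new uint[3];
        totals.GetData(result);             // blocks until the GPU is done
        Debug.Log($"Sum RGB = ({result[0] / 255f}, {result[1] / 255f}, {result[2] / 255f})");
        totals.Release();
    }
}
```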

sinsro
  • You are right, for your case compute shaders are the ideal solution and the first thing that came to my mind when reading your post: https://docs.unity3d.com/Manual/ComputeShaders.html There are a number of examples of compute shaders on the internet, but you are going in the right direction. – Arman Papikyan Oct 21 '17 at 16:43
  • @ArmanPapikyan Thank you for your reply. I am studying compute shaders, and I also found some interesting pointers related to atomic counters in OpenGL, but I am not sure how to integrate that into Unity. Are you familiar with this? Link: https://stackoverflow.com/questions/29352965/shader-for-counting-number-of-pixels – sinsro Oct 22 '17 at 02:03
  • That post says there is no such thing for Unity shaders. I guess you could try an append buffer, or a compute shader like this one: https://en.wikibooks.org/wiki/Cg_Programming/Unity/Computing_Color_Histograms. In your case it would be a much easier solution. – Arman Papikyan Oct 22 '17 at 02:15
  • @ArmanPapikyan The histogram shader is a great and highly relevant example, thank you! :) However, I was very surprised to find it is much slower than I anticipated. I ran it 100 times per update on a 1024x1024 texture, and the FPS slowed to a crawl on a GeForce 1080 Ti. At first I thought the atomic addition was causing the threads to serialize their writes, but even when allowing race conditions to occur I got the same performance. I would have thought this would be equivalent to rendering 100 1024x1024 textures 1:1 per frame, which would have been faster? – sinsro Oct 22 '17 at 14:30
  • I just ran a test and found that running a shader on a 1024x1024 texture rendered to a 1024x1024 RenderTexture is about 100 times faster than the equivalent compute shader test. – sinsro Oct 22 '17 at 14:52
  • That sounds very weird; although I have only some basic knowledge of compute shaders at best, I would still think it should run at least at the same speed. Maybe you have to make some optimizations? Anyway, when you find a solution to your problem let me know, it's an interesting problem. – Arman Papikyan Oct 22 '17 at 15:57
  • @ArmanPapikyan The benchmarking was done with the most basic shader, just reading from the texture and not much else (except for making sure the code was not optimized away). I am now looking into another way of doing this: draw into a RenderTexture using a regular shader, then, once it has been rendered, gradually reduce its size to a manageable level, like 4x4 pixels, while making sure no data is lost, and finally use ReadPixels and GetPixels to read the data (see the sketch after these comments). Unfortunately ReadPixels is super slow, and the speed seems indifferent to the RenderTexture size for some weird reason. – sinsro Oct 22 '17 at 18:29
  • Ran out of space.. I wanted to clarify that there is a noticeable difference when reading from a 1024x1024 RenderTexture compared to an 8x8 one, for example. What I meant by saying size does not matter is that the overhead of issuing a ReadPixels seems very large: I see a massive 100x slowdown in the benchmark just from reading 2x2 pixels out of a 2x2 RenderTexture in the main loop. Maybe it is because I am forcing the GPU to flush all pending operations. Not sure.. – sinsro Oct 22 '17 at 18:42
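
The render-texture reduction described in the last two comments could look roughly like the sketch below. It is an assumption-laden outline, not a tested implementation: `reduceMat` is a hypothetical material whose fragment shader (not shown) outputs the sum of the four source texels covered by each destination pixel, and a float render-texture format is used so partial sums above 1 are not clamped. The final `ReadPixels` of a single pixel still forces a CPU/GPU sync, which matches the overhead observed in the comments.

```csharp
using UnityEngine;

public static class GpuReduce
{
    // Repeatedly halve the RenderTexture with a summing Blit, then read back one pixel.
    public static Color Sum(RenderTexture source, Material reduceMat)
    {
        RenderTexture current = source;
        while (current.width > 1 || current.height > 1)
        {
            int w = Mathf.Max(1, current.width / 2);
            int h = Mathf.Max(1, current.height / 2);
            // Float format so the accumulated sums are not clamped to [0,1].
            var next = RenderTexture.GetTemporary(w, h, 0, RenderTextureFormat.ARGBFloat);
            Graphics.Blit(current, next, reduceMat);
            if (current != source) RenderTexture.ReleaseTemporary(current);
            current = next;
        }

        // Read the single remaining pixel back to the CPU (this is the synchronizing step).
        var previous = RenderTexture.active;
        RenderTexture.active = current;
        var readback = new Texture2D(1, 1, TextureFormat.RGBAFloat, false);
        readback.ReadPixels(new Rect(0, 0, 1, 1), 0, 0);
        RenderTexture.active = previous;
        if (current != source) RenderTexture.ReleaseTemporary(current);
        return readback.GetPixel(0, 0);
    }
}
```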

0 Answers