
For a while now I've been developing an invisible (read: doesn't produce any visual output) stressor to test the capabilities of my graphics card (and as an exploration of DirectCompute in general, to which I'm pretty new). I've got the following code right now that I'm pretty proud of:

RWStructuredBuffer<uint> BufferOut : register(u0);

[numthreads(1, 1, 1)]
void CSMain( uint3 DTid : SV_DispatchThreadID )
{
    uint total = 0;
    float p = 0;
    while(p++ < 40.0){      
        float s= 4.0;
        float M= pow(2.0,p) - 1.0;
        for(uint i=0; i <= p - 2; i++)
        {
            s=((s*s) - 2) % M;
        }
        if(s < 1.0) total++;
    }
    BufferOut[DTid.x] = total;
}

This runs the Lucas-Lehmer test on the Mersenne numbers 2^p - 1 for the first 40 values of p. When I dispatch this code in a timed loop and look at my graphics card's stats using GPU-Z, my GPU load shoots to 99% for the duration. I'm pretty happy with this, but I also notice that the heat generated by the fully loaded GPU is actually pretty minimal (I'm getting about a 5 to 10 degree Celsius jump, nowhere near the jump I get when running, say, Borderlands 2). My thought is that most of my heat comes from memory accesses, so I would need to include consistent memory accesses across the run. My initial code looked like this:

RWStructuredBuffer<uint> BufferOut : register(u0);

groupshared float4 memory_buffer[1024];

[numthreads(1, 1, 1)]
void CSMain( uint3 DTid : SV_DispatchThreadID )
{
    uint total = 0;
    float p = 0;
    while(p++ < 40.0){
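        [fastopt] // to lower compile times - code efficiency is, strangely, not what I'm looking for right now.
        for(uint j = 0; j < 1024; ++j)
        {
            // Assumed placeholder body (the loop body is missing here): touch every groupshared slot.
            memory_buffer[j] = float4(p, p, p, p);
        }
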
        float s= 4.0;
        float M= pow(2.0,p) - 1.0;
        for(uint i=0; i <= p - 2; i++)
        {
            s=((s*s) - 2) % M;
        }
        if(s < 1.0) total++;
    }
    BufferOut[DTid.x] = total;
}
Zach H

1 Answer


Read a lot of non-coherent samples from large textures. Try both DXT1-compressed and uncompressed formats. Also use render-to-texture and multiple render targets (MRT). All of these will beat on the GPU's memory system.
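
As a rough sketch of the first suggestion only (scattered reads from a large texture), something along these lines might work in a compute shader. The names `SourceTex` and `READS_PER_THREAD`, the thread-group size, and the pseudo-random constants are assumptions made for illustration, and the host is assumed to bind a large texture to `t0`. Render-to-texture and MRT need the graphics pipeline and are not covered here.

```hlsl
// Assumed resources: a large 2D texture bound by the host at t0,
// and the same output buffer as in the question at u0.
Texture2D<float4>        SourceTex : register(t0);
RWStructuredBuffer<uint> BufferOut : register(u0);

// Hypothetical tuning knob: number of scattered reads per thread.
static const uint READS_PER_THREAD = 1024;

[numthreads(64, 1, 1)]
void CSMain( uint3 DTid : SV_DispatchThreadID )
{
    uint width, height;
    SourceTex.GetDimensions(width, height);
    // Guard against an unbound texture, which reports a size of zero.
    width  = max(width, 1u);
    height = max(height, 1u);

    // Per-thread seed so that neighbouring threads read unrelated texels.
    uint state = DTid.x * 747796405u + 2891336453u;
    float4 acc = float4(0, 0, 0, 0);

    for (uint i = 0; i < READS_PER_THREAD; ++i)
    {
        // Linear congruential step: cheap pseudo-random texel coordinates.
        state = state * 1664525u + 1013904223u;
        uint2 coord = uint2(state % width, (state >> 16) % height);

        // Load takes integer texel coordinates plus a mip level (0 here).
        acc += SourceTex.Load(int3(coord, 0));
    }

    // Write a value that depends on every read so the compiler cannot drop them.
    BufferOut[DTid.x] = asuint(acc.x + acc.y + acc.z + acc.w);
}
```

The texture should probably be much larger than the GPU's caches for the scattered reads to reach memory. Whether this raises temperatures more than the arithmetic-only shader would need to be measured.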

bjorke