I am trying to benchmark GPUs. I have a Radeon HD 7670M (480 shaders x 600 MHz), an Intel HD 4000 (16? x 1100 MHz), and a Radeon HD 4850 (800 shaders x 625 MHz). I feed in a 4096 x 4096 Rg32f texture and read back a Red texture; each pixel takes 400 - 800 ns. The 7670M comes in at around 56% of the 4850, as expected from (480 x 600) / (800 x 625) = 58%. But the HD 4000 comes in at 74%, which is nowhere near the (16 x 1100) / (800 x 625) = 3.5% I expected. Working backwards, the HD 4000 would need roughly 350 stream processors (?) to yield 77%.
The 7670M is only faster than the HD 4000 when the texture is smaller than 512 x 512 and the shader time is under 45 ns/pixel (perfect for gaming?). I want to run about 30 minutes per texture, which means roughly 10 extra minutes on the 7670M.
Now, could it be that the HD 4000 really has ~350 streams, or is the 7670M just optimized for small(?)-texture gaming (and I've been shortchanged again), or is there some way to get the 7670M to run faster through OpenGL?
I'll try any hints on this test bed!
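For reference, the back-of-the-envelope model I'm using is just shader units x clock; the 350 figure for the HD 4000 is back-solved from the measured ratio, not taken from any spec:
// Rough throughput model: relative speed ~ (shader units x core clock)
double hd4850 = 800 * 625.0;                      // reference card
Console.WriteLine((480 * 600.0) / hd4850);        // 0.576 -> ~58%, matches the measured ~56%
Console.WriteLine((16 * 1100.0) / hd4850);        // 0.035 -> ~3.5%, nowhere near the measured 74%
Console.WriteLine((350 * 1100.0) / hd4850);       // 0.77  -> ~77%, the back-solved "350 streams"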
Here are the shaders (fragment shader first, then the vertex shader):
#version 330

uniform sampler2D inData;   // the Rg32f source texture

in vec2 glFragCoord;        // [0, 1] texture coordinate from the vertex shader
out float outData;          // single float written to the color attachment

void main(void)
{
    float x = texture(inData, glFragCoord.xy).r;
    // ..... do stuff at 400 - 800 microSeconds / Pixel randomly
    outData = x;
}
#version 330

in vec2 position;
out vec2 glFragCoord;

void main()
{
    // Map clip-space [-1, 1] positions to [0, 1] texture coordinates
    glFragCoord = position * vec2(0.5) + vec2(0.5);
    gl_Position = vec4(position, 0, 1);
}
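The compile/link step isn't shown; the boilerplate I use is roughly the following, where fragmentSource and vertexSource are placeholder strings holding the shader text above:
Int32 programId = GL.CreateProgram();
Int32 vertexId = GL.CreateShader(ShaderType.VertexShader);
GL.ShaderSource(vertexId, vertexSource);
GL.CompileShader(vertexId);
Int32 fragmentId = GL.CreateShader(ShaderType.FragmentShader);
GL.ShaderSource(fragmentId, fragmentSource);
GL.CompileShader(fragmentId);
GL.AttachShader(programId, vertexId);
GL.AttachShader(programId, fragmentId);
GL.LinkProgram(programId);
GL.UseProgram(programId);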
Here is the texture/buffer stuff:
float[] data = new float[textureSize * textureSize * 2];   // two floats per texel for Rg32f

Int32 frameBufferId = GL.GenFramebuffer();
GL.BindFramebuffer(FramebufferTarget.Framebuffer, frameBufferId);

Int32 textureId = GL.GenTexture();
GL.BindTexture(TextureTarget.Texture2D, textureId);
GL.TexImage2D(TextureTarget.Texture2D, 0, PixelInternalFormat.Rg32f, textureSize, textureSize, 0, PixelFormat.Rg, PixelType.Float, data);
GL.FramebufferTexture2D(FramebufferTarget.Framebuffer, FramebufferAttachment.ColorAttachment0, TextureTarget.Texture2D, textureId, 0);

// Without this the texture is mipmap-incomplete (the default min filter expects mipmaps)
GL.TexParameter(TextureTarget.Texture2D, TextureParameterName.TextureMinFilter, (Int32)TextureMinFilter.Linear);
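I don't check framebuffer completeness above; a check right after the attachment would look roughly like this in OpenTK (not in my original code):
FramebufferErrorCode status = GL.CheckFramebufferStatus(FramebufferTarget.Framebuffer);
if (status != FramebufferErrorCode.FramebufferComplete)
    throw new InvalidOperationException("Framebuffer incomplete: " + status);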
and here is the new vertex-buffer stuff (the old Ortho, Begin, Vertex2, and End immediate-mode calls were removed):
// Full-screen quad in normalized device coordinates
Int32 arrayBufferId = GL.GenBuffer();
GL.BindBuffer(BufferTarget.ArrayBuffer, arrayBufferId);
float[] arrayBufferData = new float[4 * 2] { -1, -1, 1, -1, 1, 1, -1, 1 };
GL.BufferData(BufferTarget.ArrayBuffer, new IntPtr(4 * 2 * sizeof(float)), arrayBufferData, BufferUsageHint.StaticDraw);

Int32 positionId = GL.GetAttribLocation(programId, "position");
GL.EnableVertexAttribArray(positionId);
GL.VertexAttribPointer(positionId, 2, VertexAttribPointerType.Float, false, 0, 0);

// One fragment per texel
GL.Viewport(0, 0, textureSize, textureSize);
GL.DrawArrays(PrimitiveType.Quads, 0, 4);
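One thing I'm aware of: PrimitiveType.Quads is only valid in a compatibility context. If a core 3.3 context were a factor here, the same quad would need a VAO and could be drawn as a triangle fan, roughly like this (untested sketch reusing arrayBufferId and positionId from above):
Int32 vertexArrayId = GL.GenVertexArray();
GL.BindVertexArray(vertexArrayId);
GL.BindBuffer(BufferTarget.ArrayBuffer, arrayBufferId);
GL.EnableVertexAttribArray(positionId);
GL.VertexAttribPointer(positionId, 2, VertexAttribPointerType.Float, false, 0, 0);
GL.DrawArrays(PrimitiveType.TriangleFan, 0, 4);   // same four corners as the Quads call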
and here is the timing stuff:
float[] result = new float[textureSize * textureSize];
Stopwatch stopWatch = new Stopwatch();
stopWatch.Start();
// ReadPixels blocks until the pending draw has finished, so this measures shader time + readback
GL.ReadPixels(0, 0, textureSize, textureSize, PixelFormat.Red, PixelType.Float, result);
stopWatch.Stop();
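To separate the shader cost from the readback cost, I figure the draw alone could be bracketed with GL.Finish, something like this sketch (GL.Finish stalls the pipeline, so it's only for measuring):
GL.Finish();                                     // drain any previously queued work
Stopwatch drawWatch = Stopwatch.StartNew();
GL.DrawArrays(PrimitiveType.Quads, 0, 4);
GL.Finish();                                     // wait for the draw itself to complete
drawWatch.Stop();
Console.WriteLine("Draw only = " + drawWatch.Elapsed.TotalMilliseconds + " ms");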
And the same measurement with a GL timer query instead:
Int32 timerQuery = GL.GenQuery();
// Only commands issued between Begin/End are timed on the GPU - here that is just the readback
GL.BeginQuery(QueryTarget.TimeElapsed, timerQuery);
GL.ReadPixels(0, 0, textureSize, textureSize, PixelFormat.Red, PixelType.Float, result);
GL.EndQuery(QueryTarget.TimeElapsed);

// Spin until the result is available, then read the elapsed GPU time (in nanoseconds)
Int32 done = 0;
while (done != 1) { GL.GetQueryObject(timerQuery, GetQueryObjectParam.QueryResultAvailable, out done); }
Int64 elapsedTime = 0;
GL.GetQueryObject(timerQuery, GetQueryObjectParam.QueryResult, out elapsedTime);
Console.WriteLine("GPU query = " + (elapsedTime / 1000000.0).ToString() + " ms");
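To time the shader itself rather than the readback, I assume the query should wrap the draw call instead; a sketch of what I mean:
GL.BeginQuery(QueryTarget.TimeElapsed, timerQuery);
GL.DrawArrays(PrimitiveType.Quads, 0, 4);
GL.EndQuery(QueryTarget.TimeElapsed);

Int32 drawDone = 0;
while (drawDone != 1) { GL.GetQueryObject(timerQuery, GetQueryObjectParam.QueryResultAvailable, out drawDone); }
Int64 drawTime = 0;
GL.GetQueryObject(timerQuery, GetQueryObjectParam.QueryResult, out drawTime);
Console.WriteLine("Draw query = " + (drawTime / 1000000.0).ToString() + " ms");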