I'm using glDrawArraysInstanced to draw 10,000 instances of a simple shape composed of 8 triangles.
After switching the card used for rendering to my dedicated NVIDIA GTX 1060, I'm getting a lower framerate and some visible stuttering.
This is the code I'm using to measure the time taken per frame:
std::chrono::steady_clock::time_point begin = std::chrono::steady_clock::now();
std::chrono::steady_clock::time_point end = std::chrono::steady_clock::now();
float i = std::chrono::duration_cast<std::chrono::microseconds>(end - begin).count() / 1000000.0f;

while (!glfwWindowShouldClose(window)) {
    end = std::chrono::steady_clock::now();
    i = std::chrono::duration_cast<std::chrono::microseconds>(end - begin).count() / 1000000.0f;
    std::cout << i << "\n";
    begin = end; // Edit
    // Other code for draw calls and to set uniforms.
}
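Measuring on the CPU with std::chrono is reasonable in itself; if anything, the std::cout inside the loop adds noise to individual readings, so averaging over many frames gives a steadier number. Below is a minimal sketch of that idea, using only std::chrono (the FrameTimer name is mine, not part of the code above):

```cpp
#include <chrono>
#include <cstddef>

// Accumulates per-frame deltas and reports the average over all frames so far,
// smoothing out one-off spikes (e.g. from console I/O inside the loop).
class FrameTimer {
public:
    // Call once per frame; returns seconds since the previous call.
    float tick() {
        auto now = std::chrono::steady_clock::now();
        float dt = std::chrono::duration_cast<std::chrono::microseconds>(now - last_).count() / 1000000.0f;
        last_ = now;
        sum_ += dt;
        ++frames_;
        return dt;
    }

    // Average frame time (seconds) over all ticks so far.
    float average() const { return frames_ ? sum_ / frames_ : 0.0f; }

private:
    std::chrono::steady_clock::time_point last_ = std::chrono::steady_clock::now();
    float sum_ = 0.0f;
    std::size_t frames_ = 0;
};
```

Printing average() once in a while, instead of every delta every frame, keeps console I/O largely out of the measurement.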
Is this the wrong way to measure time elapsed per frame? If not, why is there a drop in performance?
Here is the comparison of the output:
Edit:
The Fragment Shader simply sets the color for each fragment directly.
Vertex Shader code:
#version 450 core

in vec3 vertex;
out vec3 outVertex;

uniform mat4 mv_matrix;
uniform mat4 proj_matrix;
uniform float time;

const float vel = 1.0;
const float PHI = 1.61803398874989484820459;

float noise(in vec2 xy, in float seed) {
    return fract(tan(distance(xy * PHI, xy) * seed) * xy.x);
}

void main() {
    float y_coord = noise(vec2(-500 + gl_InstanceID / 100, -500 + gl_InstanceID % 100), 20) * 40 + vel * time;
    y_coord = mod(y_coord, 40) - 20;
    mat4 translationMatrix = mat4(vec4(1, 0, 0, 0),
                                  vec4(0, 1, 0, 0),
                                  vec4(0, 0, 1, 0),
                                  vec4(-50 + gl_InstanceID / 100, y_coord, -50 + gl_InstanceID % 100, 1));
    gl_Position = proj_matrix * mv_matrix * translationMatrix * vec4(vertex, 1);
    outVertex = vertex;
}
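The per-instance placement above is deterministic: instance i lands on a 100×100 grid at x = -50 + i/100, z = -50 + i%100, and noise() only offsets the starting phase of each instance's vertical motion. For reference, here is a C++ port of the same math (function names mirror the GLSL; the gridPos helper is my own), which shows the noise value always lands in [0, 1):

```cpp
#include <cassert>
#include <cmath>

const float PHI = 1.61803398874989484820459f;

// fract(x) as defined by GLSL: x - floor(x), always in [0, 1).
float fract(float x) { return x - std::floor(x); }

// C++ port of the vertex shader's noise(). Note that distance(xy * PHI, xy)
// reduces to length(xy) * (PHI - 1), since xy * PHI - xy = xy * (PHI - 1).
float noise(float x, float y, float seed) {
    float dist = std::sqrt(x * x + y * y) * (PHI - 1.0f);
    return fract(std::tan(dist * seed) * x);
}

// Grid position fed into the translation matrix for a given gl_InstanceID.
void gridPos(int instance, int& gx, int& gz) {
    gx = -50 + instance / 100; // column: -50..49 for 10000 instances
    gz = -50 + instance % 100; // row:    -50..49
}
```

Since nothing here depends on previous frames, the per-frame GPU work is constant, which makes the large timing difference between the two cards all the more surprising.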
I'm selecting the card used for rendering with this export:
extern "C" {
    __declspec(dllexport) DWORD NvOptimusEnablement = 0x00000001;
}
The rendered output is the same for both cards and is shown here:
The desired result is an increased frame rate when rendering on the dedicated GPU, i.e. smaller time gaps between the rows in the comparison image attached. With the Intel integrated card it takes <0.01 seconds to render one frame; with the dedicated GTX 1060 it takes ~0.2 seconds per frame.