On OS X, I have a very simple game loop that runs some OpenGL code: just a few matrix transformations (almost all of which are computed on the GPU) and a texture whose binding is managed intelligently, so that it is not bound every frame. The texture simply rotates at a moderate pace. That is all. I am ticking at 60 ticks per second. The window is 1300 by 650 (it actually does better in fullscreen, for some reason, but I am testing windowed). I am getting about 350-400 FPS. The CPU usage is always around 200% (about half of the total; it is a quad-core i5). Also, weirdly enough, turning on vertical sync does not change any of this: it locks the FPS at 60, as expected, but the CPU usage stays at around 200%.
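
By "managed intelligently" I just mean the texture is only bound when it actually changes; a minimal sketch of that idea (the cached boundTexture variable is illustrative, not my exact code):

// Only call glBindTexture when a different texture is requested.
// `boundTexture` is an illustrative cached ID.
GLuint boundTexture = 0;

void bindTextureCached(GLuint texture) {
    if (texture != boundTexture) {
        glBindTexture(GL_TEXTURE_2D, texture);
        boundTexture = texture;
    }
}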

Here are the computer's specifications, in case they are relevant:

  • iMac (27-inch, Late 2009)
  • Processor: 2.66 GHz Quad-Core Intel Core i5 (with turbo boost to 3.2 GHz)
  • Memory: 16 GB 1333 MHz DDR3
  • Graphics: ATI Radeon HD 4850 512 MB

And here is the game loop:

float time = 0.0f;
float currentTime = 0.0f;

float tickTimer = 0.0f;
float frameTimer = 0.0f;

int frames = 0;

float rate = 1.0f / 60.0f;
float deltaTime = rate;

const milliseconds sleepDuration(1);

glfwMakeContextCurrent(nullptr);

thread tickThread([&](){
    while(!glfwWindowShouldClose(window)) {
        while(tickTimer >= rate) {
            tickTimer -= rate;
            time += rate;

            mtx.lock();
            tickCallback(deltaTime);
            mtx.unlock();
        }
    }
});

thread renderThread([&](){
    glfwMakeContextCurrent(window);

    while(!glfwWindowShouldClose(window)) {
        glClear(GL_COLOR_BUFFER_BIT | GL_DEPTH_BUFFER_BIT);

        mtx.lock();
        renderCallback();
        mtx.unlock();

        this_thread::sleep_for(sleepDuration);

        frames++;

        if (frameTimer >= 1) {
            frameTimer = 0;
            fps = frames;
            frames = 0;
        }

        glfwSwapBuffers(window);
    }
});

while(!glfwWindowShouldClose(window)) {
    const float newTime = static_cast<float>(glfwGetTime());
    deltaTime = newTime - currentTime;
    currentTime = newTime;

    tickTimer += deltaTime;

    frameTimer += deltaTime;

    glfwPollEvents();
}

After reading some other similar questions, it seemed as though the loop was running too fast, completely unrestrained, and thus consuming far more CPU than necessary. To test this assumption, I set sleepDuration to about 100 milliseconds, which should be more than enough time for the CPU to cool back down. It didn't help; the usage only dropped to about 180% on average. I tried putting a this_thread::sleep_for(sleepDuration) in the ticking part of the loop as well (roughly as in the sketch below), and it reduced the usage to about 100%, which is significant, but still way too much for such a small amount of code. Re-Logic's Terraria runs at about 40% maximum on this computer. I have tried cutting out the multithreading, so that everything runs in one big loop; I have tried async; I have even tried compiling with GCC at the highest optimization level, and nothing changes.
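
For clarity, the version with the extra sleep looks roughly like this (a sketch; same variables as in the loop above, with the sleep added at the bottom of the outer loop):

// Same tick thread as above, but sleeping once per pass of the outer loop
// so it is not spinning flat out while waiting for the next tick.
thread tickThread([&](){
    while(!glfwWindowShouldClose(window)) {
        while(tickTimer >= rate) {
            tickTimer -= rate;
            time += rate;

            mtx.lock();
            tickCallback(deltaTime);
            mtx.unlock();
        }

        this_thread::sleep_for(sleepDuration); // the added sleep
    }
});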

I don't know if this will help, but I am also experiencing frequent, yet reasonably short, lag spikes. They do not seem to show up in the FPS, so it must be the ticking. I expect it has something to do with my program consuming almost all of the computer's resources, so that other programs have to fight to get their share, but that is just my theory. I really don't know that much about this. So, does anyone know what I'm doing wrong?

EDIT: Something I forgot to add: in similar questions, I have read that glfwSwapBuffers() tends to be the culprit. That is not the case here. When I remove it, the CPU usage does drop to ~140% (still a ridiculous number), and the FPS skyrockets to about 6200, which makes sense, but I just don't get why the CPU usage is so high. It must be in the ticking (which is just computing a rotation matrix, nothing more; see the sketch below).
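
The entire tick callback is essentially the following (a sketch assuming GLM; angle and rotationSpeed are illustrative names, not my exact code):

#include <glm/glm.hpp>
#include <glm/gtc/matrix_transform.hpp>

// Essentially all the tick does: advance an angle and rebuild one
// rotation matrix. Illustrative sketch only.
float angle = 0.0f;
const float rotationSpeed = glm::radians(90.0f); // 90 degrees per second
glm::mat4 model(1.0f);

void tickCallback(float deltaTime) {
    angle += rotationSpeed * deltaTime;
    model = glm::rotate(glm::mat4(1.0f), angle, glm::vec3(0.0f, 0.0f, 1.0f));
}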

Sus Among Us
  • Look at what `tickThread` would do when `tickTimer` is less than `rate`... how do you avoid that situation? – Dmitri Jul 11 '16 at 18:27
  • You have 3 threads here, each running a loop. Two of them are just busily spinning all the time, and the third one is the GL thread, which should be the least of your concerns since it will be slowed down by the GL. I don't see how you would expect this code to _not_ max out at least two cores. – derhass Jul 11 '16 at 18:32
  • @derhass Well like I said, I really do not know much about this, nor do I have any brain whatsoever. I see what both you and Dmitri are saying. For some reason it didn't occur to me what would happen when, as Dmitri said, `tickTimer` was less than `rate` (it just spins again, of course, and burns more CPU). I always manage to make myself look pretty stupid on SO... Thank you though! Much appreciated – Sus Among Us Jul 11 '16 at 19:00
  • @Dmitri +1. Didn't even cross my mind... What should I do? I tried sleeping for a millisecond if `tickTimer` is less than `rate` and it's making things a bit laggy. – Sus Among Us Jul 11 '16 at 19:20
  • If you want to limit how frequently a thread's loop runs, check the time at the end of each loop pass and compare it to the previous time. Then find the difference between the desired interval between loop passes and the time your loop iteration took, and sleep for that long (but make sure it's less than the desired interval); see the sketch after these comments. – Dmitri Jul 11 '16 at 20:41
  • As derhass said, there's no reason why this code shouldn't max out at least two cores. As a side note, you have multiple threads reading and writing to variables without synchronization (tickTimer and frameTimer). This is called a data race and is undefined behavior in C++. You should order a copy of C++ Concurrency in Action to learn more about multi-threading in C++ ([also available as a pdf](http://www.bogotobogo.com/cplusplus/files/CplusplusConcurrencyInAction_PracticalMultithreading.pdf)). – jhoffman0x Jul 11 '16 at 22:34
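
A minimal sketch of the timing approach Dmitri describes above, assuming std::chrono; runAtFixedRate, doWork, and running are illustrative names, not part of the question's code:

#include <chrono>
#include <functional>
#include <thread>

// Measure how long each pass of the loop took, then sleep for whatever
// remains of the desired interval. Illustrative sketch only.
void runAtFixedRate(const std::function<void()>& doWork, const bool& running,
                    std::chrono::duration<double> interval) {
    using clock_type = std::chrono::steady_clock;
    while (running) {
        const auto start = clock_type::now();
        doWork();

        const auto elapsed = clock_type::now() - start;
        if (elapsed < interval)
            std::this_thread::sleep_for(interval - elapsed);
    }
}

// Example: run the tick callback at 60 Hz.
// runAtFixedRate([&]{ tickCallback(1.0f / 60.0f); }, keepRunning,
//                std::chrono::duration<double>(1.0 / 60.0));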

0 Answers