
I am working on a single-threaded graphical program that renders with SDL2. See the end of the question for a minimal reproducible example.

It runs on both an old Linux machine and a somewhat less old Mac. The Linux computer has 1.60 GHz processors, while the Mac's processors run at 2.2 GHz. The SDL version on Linux is 2.0.8, while the SDL version on the Mac is 2.0.10. On both computers I compiled with clang++ using the optimization flags -O3 and -flto. I invoked the executable with ./intergrid -fullscreen -pixel-size 3 (essentially, I had the program draw very many pixels).

For some reason, the slower Linux computer handled the program without breaking a sweat, while the Mac took several seconds to render the first frame. As expected, the Mac was faster than the Linux machine when I used the -no-draw flag to disable graphics.

EDIT: The Linux computer has "Intel Haswell Mobile" for graphics and the Mac lists "Intel Iris Pro 1536 MB."

Here is a minimal reproducible example:

#include <SDL2/SDL.h>
#include <stdio.h>

int main(void)
{
    SDL_Init(SDL_INIT_VIDEO | SDL_INIT_TIMER);

    SDL_Window *window = SDL_CreateWindow(
        "Example",
        SDL_WINDOWPOS_UNDEFINED, SDL_WINDOWPOS_UNDEFINED,
        0, 0,
        SDL_WINDOW_SHOWN);
    SDL_SetWindowFullscreen(window, SDL_WINDOW_FULLSCREEN_DESKTOP);

    SDL_Renderer *renderer = SDL_CreateRenderer(window, -1, 0);

    SDL_Rect viewport;
    SDL_RenderGetViewport(renderer, &viewport);

    // The screen is not rendered to unless this is done:
    SDL_Event event;
    while (SDL_PollEvent(&event))
        ;

    Uint32 ticks_before = SDL_GetTicks();
    for (int x = 0; x < viewport.w - 10; x += 10) {
        for (int y = 0; y < viewport.h - 10; y += 10) {
            // I just chose a random visual effect for this example.
            SDL_Rect square;
            square.x = x;
            square.y = y;
            square.w = 10;
            square.h = 10;
            SDL_SetRenderDrawColor(renderer, x % 256, y % 256, 255, 255);
            SDL_RenderFillRect(renderer, &square);
        }
    }
    Uint32 ticks_after = SDL_GetTicks();
    printf("Ticks taken to render: %u\n", ticks_after - ticks_before);

    SDL_RenderPresent(renderer);

    SDL_Delay(500);

    // I won't worry about cleaning stuff up.
}

I compiled this on Mac and Linux with clang++ -O3 -flto <filename> -lSDL2. When I ran the program on the Mac, it printed:

Ticks taken to render: 849

The program on Linux printed:

Ticks taken to render: 4

That's a gigantic difference!

JudeMH
  • Have you profiled your code? What graphics chipsets and drivers are in the different computers? – Dai Jan 12 '20 at 02:50
  • @Dai I really don't have much experience with graphics, or profiling, in fact. I have looked at the "graphics" sections in the "about this computer" information for both computers and added it to the question. If the difference is just due to different GPUs and is not fixable, I can close the question. – JudeMH Jan 12 '20 at 03:15
  • Not sure, but maybe SDL is choosing 30-bit (10 bpc) color on the Mac while Linux stays at 24-bit. Playing with pixel formats might be worth checking: https://forums.libsdl.org/viewtopic.php?p=46654 and https://wiki.libsdl.org/SDL_PixelFormat – Abdurrahim Jan 12 '20 at 03:34
  • How slow? Significantly more than ~16.6 milliseconds/frame? Right around ~16.6 milliseconds/frame? How many milliseconds per frame was Linux doing? Also, edit in a [mcve]. – genpfault Jan 12 '20 at 04:03
  • @genpfault I have added a better example. – JudeMH Jan 12 '20 at 12:07
  • @JudeMH is it with the metal or the opengl renderer (`SDL_GetRendererInfo`, `name` field)? Could you check with each (`SDL_SetHint(SDL_HINT_RENDER_DRIVER, "opengl")` *before* creating the window/renderer but after `SDL_Init`)? There is also a general problem with how you're measuring time: a lot of the actual work happens in parallel and only blocks when the queue is full (see the timing sketch after this comment thread), but 849 is way too high either way. – keltar Jan 12 '20 at 14:22
  • @keltar You just solved my problem. OpenGL was being used by default on Linux, but the Mac was using Metal. By calling `SDL_SetHint(SDL_HINT_RENDER_DRIVER, "opengl")`, I got the Mac to run as fast as the Linux machine. I still don't know why Metal is so much slower, but my problem is solved, which is the important part. – JudeMH Jan 12 '20 at 14:53
  • Switching to OpenGL on macOS, which has been deprecated and will be removed in the future, doesn't seem like the right solution here. I'm also seeing massively lower FPS in Metal vs. OpenGL, at least when using the built-in SDL drawing APIs. In fact, I'm convinced Metal is somehow CPU bound, not GPU bound. – Dids Sep 01 '20 at 12:09
  • @Dids I'm just setting the hint, so I think the program will still run and hopefully use Metal when OpenGL is removed. Hopefully the Metal backend will be faster by that time. It does seem likely that Metal is currently CPU bound for some reason. The `SDL_SetHint` solution is working for me at the moment, but there might be a better way. – JudeMH Sep 02 '20 at 23:07
  • @JudeMH That's a good and totally valid point! I've recently realized just how "unstable" the current Metal support is. Not just performance issues, but there seem to be issues with renderer clipping too. I'll likely end up doing exactly what you've done, trusting the SDL2 devs to figure it out eventually. Thank you. :) – Dids Sep 04 '20 at 04:22
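
As keltar notes above, SDL_GetTicks() around the draw loop alone mostly measures how long it takes to submit the rectangles, since a backend may only queue the work there and block later. A minimal sketch of the extra measurement, assuming it replaces everything from the first SDL_GetTicks() call through SDL_RenderPresent in the example above (the submit/present split is an illustration, not part of the original program):

Uint32 ticks_before = SDL_GetTicks();
for (int x = 0; x < viewport.w - 10; x += 10) {
    for (int y = 0; y < viewport.h - 10; y += 10) {
        SDL_Rect square = { x, y, 10, 10 };
        SDL_SetRenderDrawColor(renderer, x % 256, y % 256, 255, 255);
        SDL_RenderFillRect(renderer, &square);
    }
}
// Point at which all draw calls have been submitted (but possibly only queued).
Uint32 ticks_after_submit = SDL_GetTicks();

SDL_RenderPresent(renderer);
// Point at which the backend has had to finish the queued work for this frame.
Uint32 ticks_after_present = SDL_GetTicks();

printf("Submit: %u ms, submit + present: %u ms\n",
       ticks_after_submit - ticks_before,
       ticks_after_present - ticks_before);

If most of the 849 ms still shows up before SDL_RenderPresent, the Metal backend is spending that time on the CPU during submission rather than merely queueing work for the GPU.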

1 Answer


@keltar found a solution that is good enough for me, but they have not yet posted it as an answer, so I will. For some reason, SDL2's Metal backend is immensely slow, so the solution is to use the OpenGL backend instead. I accomplished this by calling SDL_SetHint(SDL_HINT_RENDER_DRIVER, "opengl") if I found that the default driver was Metal (using SDL_GetRendererInfo).
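
A minimal sketch of that check, assuming the simplest structure where the renderer is just recreated after setting the hint (the helper name is only for illustration):

#include <SDL2/SDL.h>
#include <string.h>

// Create the default renderer, ask SDL which backend it picked, and if it
// is Metal, recreate the renderer with the "opengl" driver hint set.
// Error handling is omitted for brevity.
static SDL_Renderer *create_renderer_preferring_opengl(SDL_Window *window)
{
    SDL_Renderer *renderer = SDL_CreateRenderer(window, -1, 0);

    SDL_RendererInfo info;
    if (renderer != NULL &&
        SDL_GetRendererInfo(renderer, &info) == 0 &&
        strcmp(info.name, "metal") == 0) {
        // Metal was chosen by default; throw this renderer away and ask
        // for the OpenGL backend instead.
        SDL_DestroyRenderer(renderer);
        SDL_SetHint(SDL_HINT_RENDER_DRIVER, "opengl");
        renderer = SDL_CreateRenderer(window, -1, 0);
    }
    return renderer;
}

Note that the hint only matters if it is set before the renderer that should use OpenGL is created, since SDL_HINT_RENDER_DRIVER is consulted at creation time.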

JudeMH
  • @keltar I have recently played around with a cellular automaton simulation where I repeatedly redraw all pixels in a 128x128 grid/texture (500 iterations). With the default Metal rendering I land at around 15 fps; with the opengl hint above I get around 210! – krueger Mar 16 '20 at 19:12