
I am trying to create a simple 3D rendering engine in pygame (I know, not the fastest program to use but I want to at least try), and I used this to project the 3D coordinates to the 2D screen.

Next, I am using ChiliTomatoNoodle's great tutorial (although slightly adapted) to turn 3 vertices into a solid, filled triangle using horizontal (scanline) rasterization, and then my own implementation of a zBuffer.

The slowest and most problematic part is the for x in range(lineLength) loop at the bottom, where I write values into the pixel array and the NumPy z-buffer:

# Draw a triangle with a flat bottom (p1 is the top vertex) with horizontal rasterization
def drawFlatBottom(p1, p2, p3, col):
    global pxarray
    global zbuffer

    # Calculate slope of line (Is RUN/RISE, not other way around to prevent a slope of infinity)
    try:
        m1 = (p2[0] - p1[0]) / (p2[1] - p1[1]) # I may be doing this wrong, but this is a hotfix for now to avoid divide by 0 errors
    except ZeroDivisionError:
        m1 = 0
    try:
        m2 = (p3[0] - p1[0]) / (p3[1] - p1[1])
    except ZeroDivisionError:
        m2 = 0

    # Calculate the z slope. Again, is RUN/RISE
    try:
        mz1 = (p2[2] - p1[2]) / (p2[1] - p1[1])
    except ZeroDivisionError:
        mz1 = 0
    try:
        mz2 = (p3[2] - p1[2]) / (p3[1] - p1[1])
    except ZeroDivisionError:
        mz2 = 0

    yStart = int(p1[1])
    yEnd = int(p3[1])

    if yEnd > height:
        yEnd = height
    if yStart < 0:
        yStart = 0

    # Repeat for each row
    y = yStart
    while y < yEnd:
        # Get the x positions where it intercepts with the edge
        xStart = int(m1 * (y - p1[1]) + p1[0])
        xEnd = int(m2 * (y - p1[1]) + p1[0])

        if xEnd > width:
            xEnd = width
        if xStart < 0:
            xStart = 0

        lineLength = xEnd - xStart

        # Do the same with the z positions
        zStart = mz1 * (y - p1[1]) + p1[2]
        zEnd = mz2 * (y - p1[1]) + p1[2]

        # Find the new slope, RUN/RISE
        try:
            mz = (zEnd - zStart) / lineLength
        except ZeroDivisionError:
            mz = 0
        z = zStart

        # Fill each pixel
        for x in range(lineLength):
            # Check if the pixel in the z buffer is further away than this pixel
            if zbuffer[x + xStart][y] > z:
                # Write the new value to the depth buffer
                zbuffer[x + xStart][y] = z

                # Draw the pixel
                pxarray[x + xStart][y] = col

            z += mz

        y += 1
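
For comparison, the per-pixel loop can be replaced by one NumPy slice operation per row. The following is only a sketch under some assumptions: fill_scanline is a hypothetical helper, and colors is assumed to be a NumPy array of shape (width, height, 3) rather than the PixelArray used above (pygame.surfarray.pixels3d(win) returns such an array for 24-bit and 32-bit surfaces). It visits the same depths as the scalar loop, zStart, zStart + mz, and so on.

```python
import numpy as np

def fill_scanline(zbuffer, colors, y, x_start, x_end, z_start, z_end, col):
    """Depth-test and fill the span [x_start, x_end) on row y in one step.

    Hypothetical helper; `colors` is assumed to be a (width, height, 3) array.
    """
    length = x_end - x_start
    if length <= 0:
        return
    # Depth at each pixel of the span: z_start + i * mz for i = 0..length-1
    z = z_start + (z_end - z_start) / length * np.arange(length)
    # Basic slicing returns a view, so writes go straight into zbuffer
    z_row = zbuffer[x_start:x_end, y]
    visible = z < z_row
    z_row[visible] = z[visible]
    # The colour tuple is broadcast over every visible pixel
    colors[x_start:x_end, y][visible] = col

# Small usage example on an 8 x 4 buffer
zbuffer = np.full((8, 4), 999.0)
colors = np.zeros((8, 4, 3), dtype=np.uint8)
fill_scanline(zbuffer, colors, 1, 2, 6, 10.0, 14.0, (255, 0, 0))
```

The Python-level work then scales with the number of rows instead of the number of pixels, which is usually where most of the time goes in a loop like the one above.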

In the function drawFlatBottom(), p1, p2 and p3 are all lists in the format [x, y, z] in screen space, and are the 3 vertices of the triangle. col is an RGB value such as (255, 255, 255).

pxarray is a pygame.PixelArray (not a true NumPy array), which gets created at the start of the frame, before the call of the function drawFlatTop(), with:

pxarray = pygame.PixelArray(win)

This then gets closed with:

pygame.PixelArray.close(pxarray)

The zbuffer has the same dimensions as pxarray: a 600 by 480 NumPy array filled with 999 so that the first triangles always get drawn:

zbuffer = np.full((width, height), 999, dtype=float)
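
The source does not say how often this array is created. If it is rebuilt every frame, one option is to allocate it once and reset it in place. This is a minimal sketch: clear_depth is a hypothetical name, and np.inf is swapped in for 999 (an assumption, so that any finite depth passes the first test).

```python
import numpy as np

width, height = 600, 480  # sizes stated in the question

# Allocate once, outside the frame loop
zbuffer = np.full((width, height), np.inf, dtype=float)

def clear_depth(buf):
    """Reset every depth to 'infinitely far' in place, without reallocating."""
    buf.fill(np.inf)

zbuffer[10, 20] = 3.5   # pretend a triangle was drawn during the last frame
clear_depth(zbuffer)    # start of the next frame
```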

The Problem

All this works fine, except that it is extremely slow. I would expect it to be somewhat slow, as I am iterating over each pixel, but I wonder how triple A games can get such high framerates with heaps of lighting, shaders and so on, when this simple scene with 4 triangles can barely reach 30 FPS.

I used pprofile to generate a report of how long a single frame takes, and it shows that my function to draw all 4 triangles takes 92.44% of the total runtime, or 0.316672 seconds of a total runtime of 0.342588.

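Here is the profile of the loop that actually draws a triangle (this covers all the flat-bottom triangles; drawFlatTop looks similar). The columns are line number, hits, total time in seconds, time per hit, and share of the total runtime: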

364|         0|            0|            0|  0.00%|        # Fill each pixel
365|     11890|    0.0352542|  2.96503e-06| 10.29%|        for x in range(lineLength):
366|         0|            0|            0|  0.00%|            # Check if the pixel in the z buffer is further away than this pixel
367|     11673|    0.0446463|  3.82475e-06| 13.03%|            if zbuffer[x + xStart][y] > z:
368|         0|            0|            0|  0.00%|                # Write the new value to the depth buffer
369|     11673|    0.0573983|  4.91719e-06| 16.75%|                zbuffer[x + xStart][y] = z
370|         0|            0|            0|  0.00%|
371|         0|            0|            0|  0.00%|                # Draw the pixel
372|     11673|    0.0462677|  3.96366e-06| 13.51%|                pxarray[x + xStart][y] = col
373|         0|            0|            0|  0.00%|
374|     11673|    0.0377977|  3.23804e-06| 11.03%|            z += mz
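
One detail in the listing above is the chained indexing zbuffer[x + xStart][y]. On a NumPy array this first builds a temporary row view and then indexes it, while zbuffer[x + xStart, y] is a single indexing operation. The sketch below compares the two on a NumPy array only; the function names are hypothetical and no timings are claimed here, since they depend on the machine.

```python
import timeit
import numpy as np

zbuffer = np.full((600, 480), 999.0)

def chained(x, y):
    # zbuffer[x] creates an intermediate 1-D view, which is then indexed by y
    return zbuffer[x][y]

def single(x, y):
    # One indexing operation with a tuple key, no intermediate view
    return zbuffer[x, y]

if __name__ == "__main__":
    for fn in (chained, single):
        elapsed = timeit.timeit(lambda: fn(10, 20), number=100_000)
        print(f"{fn.__name__}: {elapsed:.3f} s per 100,000 reads")
```

Both forms return the same value, so the change is safe to try, but it only trims a constant factor from each pixel; the per-pixel Python loop itself remains.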

The Question

What's the best way to draw a bunch of pixels to the screen/place values in an array? I don't want such a massive amount of runtime to be dedicated to just drawing pixels to the screen!

Thank you very much for any help you can provide :)

EDIT: Those measurements were taken with a triangle taking up a small portion of the screen. Now, if the screen is even half covered with pixels it slows to ridiculous framerates, and with a screen mostly covered it literally cannot even reach 1 FPS.
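
The figures in the profile are consistent with this. As a rough estimate, adding up the time-per-hit column for the five loop statements gives about 19 microseconds per pixel, so a fully covered 600 by 480 screen would take about 5.4 seconds per frame. This assumes every pixel passes the depth test and includes the profiler's own overhead, so real frames without pprofile should be somewhat faster.

```python
# Time per hit (seconds) from the pprofile listing, source lines 365-374
per_hit = {
    "for x in range(lineLength)": 2.96503e-06,
    "if zbuffer[...] > z": 3.82475e-06,
    "zbuffer[...] = z": 4.91719e-06,
    "pxarray[...] = col": 3.96366e-06,
    "z += mz": 3.23804e-06,
}
seconds_per_pixel = sum(per_hit.values())

width, height = 600, 480  # buffer size stated for the zbuffer
full_screen_seconds = seconds_per_pixel * width * height

print(f"{seconds_per_pixel * 1e6:.2f} microseconds per pixel")
print(f"{full_screen_seconds:.2f} seconds per fully covered frame")
```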

I am willing to try just about anything to optimise this drawing, as I intend to do a basic 3D game on an Arduino, with a much slower clock speed than my computer.

Is there some faster way to loop through pixels? Could I even write something in assembly to maximise efficiency? Does Python allow that, and would it be practical? Thanks again for any help.

Resistnz
  • *"How can triple A games can get such high framerates"* - They do all this on the GPU. They use game engines like [Unity](https://unity.com/) or [Unreal](https://www.unrealengine.com/en-US/). These engines use [Vulkan](https://vulkan.lunarg.com/doc/sdk/1.2.135.0/windows/getting_started.html), [OpenGL (ES)](https://www.khronos.org/registry/OpenGL/index_gl.php), [DirectX](https://de.wikipedia.org/wiki/DirectX) etc., depending on the system. – Rabbid76 Oct 08 '20 at 15:48
  • Of course! Didn't even think of that. Is there any way to get faster speeds with just CPU power though? I'd like to leave out dedicated GPU as ultimately I hope to develop something basic running on an Arduino, and I doubt there's any GPU on that – Resistnz Oct 09 '20 at 07:12
