I am trying to create a simple 3D rendering engine in pygame (I know, not the fastest program to use but I want to at least try), and I used this to project the 3D coordinates to the 2D screen.
Next, I am using ChiliTomatoNoodle's great tutorial (although slightly adapted) to turn 3 vertices into a solid, filled triangle using horizontal (scanline) rasterization, and then my own implementation of a zBuffer.
The slowest and most problematic part is the for x in range(lineLength) loop at the bottom, where I write values into the NumPy array:
# Draw a triangle with a flat bottom with horizontal rasterization
def drawFlatBottom(p1, p2, p3, col):
    global pxarray
    global zbuffer

    # Calculate slope of line (Is RUN/RISE, not other way around to prevent a slope of infinity)
    try:
        m1 = (p2[0] - p1[0]) / (p2[1] - p1[1])  # I may be doing this wrong, but this is a hotfix for now to avoid divide by 0 errors
    except ZeroDivisionError:
        m1 = 0
    try:
        m2 = (p3[0] - p1[0]) / (p3[1] - p1[1])
    except ZeroDivisionError:
        m2 = 0

    # Calculate the z slope. Again, is RUN/RISE
    try:
        mz1 = (p2[2] - p1[2]) / (p2[1] - p1[1])
    except ZeroDivisionError:
        mz1 = 0
    try:
        mz2 = (p3[2] - p1[2]) / (p3[1] - p1[1])
    except ZeroDivisionError:
        mz2 = 0

    # Clamp the row range to the screen
    yStart = int(p1[1])
    yEnd = int(p3[1])
    if yEnd > height:
        yEnd = height
    if yStart < 0:
        yStart = 0

    # Repeat for each row
    y = yStart
    while y < yEnd:
        # Get the x positions where it intercepts with the edge
        xStart = int(m1 * (y - p1[1]) + p1[0])
        xEnd = int(m2 * (y - p1[1]) + p1[0])
        if xEnd > width:
            xEnd = width
        if xStart < 0:
            xStart = 0
        lineLength = xEnd - xStart

        # Do the same with the z positions
        zStart = mz1 * (y - p1[1]) + p1[2]
        zEnd = mz2 * (y - p1[1]) + p1[2]

        # Find the new slope, RUN/RISE
        try:
            mz = (zEnd - zStart) / lineLength
        except ZeroDivisionError:
            mz = 0
        z = zStart

        # Fill each pixel
        for x in range(lineLength):
            # Check if the pixel in the z buffer is further away than this pixel
            if zbuffer[x + xStart][y] > z:
                # Write the new value to the depth buffer
                zbuffer[x + xStart][y] = z

                # Draw the pixel
                pxarray[x + xStart][y] = col

            z += mz
        y += 1
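One possible direction for the inner loop, offered as a sketch rather than a drop-in fix: let NumPy depth-test and fill a whole row at once instead of visiting each pixel from Python. The helper name fill_row and the separate colors array of shape (width, height, 3) are assumptions made for illustration; the function above writes to a pygame.PixelArray instead.

```python
import numpy as np


def fill_row(zbuffer, colors, y, x_start, x_end, z_start, z_end, col):
    """Depth-test and fill one scanline with array operations (hypothetical helper)."""
    n = x_end - x_start
    if n <= 0:
        return
    # Same z values the per-pixel loop would produce: z_start + i * mz
    mz = (z_end - z_start) / n
    z = z_start + mz * np.arange(n)
    # Slice + integer index gives views, so writes go to the real buffers
    zrow = zbuffer[x_start:x_end, y]
    crow = colors[x_start:x_end, y]
    # True wherever the new pixel is nearer than the stored depth
    nearer = z < zrow
    zrow[nearer] = z[nearer]
    crow[nearer] = col


# Tiny demo buffers (sizes chosen only for the example)
width, height = 8, 4
zbuffer = np.full((width, height), 999.0)
colors = np.zeros((width, height, 3), dtype=np.uint8)

# A red span with z going 10, 11, 12, 13, then a green span at constant z = 12
fill_row(zbuffer, colors, 1, 2, 6, 10.0, 14.0, (255, 0, 0))
fill_row(zbuffer, colors, 1, 0, 8, 12.0, 12.0, (0, 255, 0))
```

In the demo the green span only overwrites pixels where it is strictly nearer, so the red pixels at depth 10 to 12 survive. If a colour array like this were used for the whole frame, pygame.surfarray.blit_array(win, colors) could copy it to the window in one call; whether that is faster overall in this program would need to be measured.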
In the function drawFlatBottom(), p1, p2 and p3 are all lists in the format [x, y, z] in screen space, and are the 3 vertices of the triangle. col represents an RGB value in the form of (255, 255, 255).
pxarray is a pygame.PixelArray (array-like, but not an actual NumPy array), which gets created at the start of the frame, before the call to drawFlatTop(), with:
pxarray = pygame.PixelArray(win)
This then gets closed with:
pygame.PixelArray.close(pxarray)
The zbuffer is indexed the same way as pxarray; it is a 600 by 480 NumPy array filled with 999 to ensure the first triangles get drawn:
zbuffer = np.full((width, height), 999, dtype=float)
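As a small sketch of how a depth buffer like this can be handled between frames, assuming it is allocated once and reused: fill() resets it in place, and indexing with a single [x, y] pair addresses the same element as [x][y] without first building an intermediate row object. The names FAR and begin_frame are hypothetical, and the 600 by 480 size is taken from the description above.

```python
import numpy as np

width, height = 600, 480  # size taken from the text above
FAR = 999.0               # "very far away" depth, as in the question

# Allocate once, outside the frame loop
zbuffer = np.full((width, height), FAR, dtype=float)


def begin_frame(zbuffer):
    """Reset every depth to FAR in place (hypothetical helper)."""
    zbuffer.fill(FAR)


# Both spellings address the same element; [x, y] does it in one step
zbuffer[3, 4] = 1.5
same_value = zbuffer[3][4]

begin_frame(zbuffer)
```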
The Problem
All this works fine, except that it is extremely slow. I expected it to be somewhat slow since I am iterating over each pixel, but I wonder how triple-A games can get such high framerates with heaps of lighting and shaders, while this simple scene with 4 triangles can barely reach about 30 FPS.
I used pprofile to generate a report of how long a single frame takes, and it shows that my function to draw all 4 triangles takes 92.44% of the total runtime, or 0.316672 seconds of a total runtime of 0.342588.
Here is the actual drawing of a triangle (This is all the flat bottom triangles, drawFlatTop looks similar):
Line #|  Hits|      Time| Time per hit|      %|Source code
------+------+----------+-------------+-------+-----------
   364|     0|         0|            0|  0.00%|        # Fill each pixel
   365| 11890| 0.0352542|  2.96503e-06| 10.29%|        for x in range(lineLength):
   366|     0|         0|            0|  0.00%|            # Check if the pixel in the z buffer is further away than this pixel
   367| 11673| 0.0446463|  3.82475e-06| 13.03%|            if zbuffer[x + xStart][y] > z:
   368|     0|         0|            0|  0.00%|                # Write the new value to the depth buffer
   369| 11673| 0.0573983|  4.91719e-06| 16.75%|                zbuffer[x + xStart][y] = z
   370|     0|         0|            0|  0.00%|
   371|     0|         0|            0|  0.00%|                # Draw the pixel
   372| 11673| 0.0462677|  3.96366e-06| 13.51%|                pxarray[x + xStart][y] = col
   373|     0|         0|            0|  0.00%|
   374| 11673| 0.0377977|  3.23804e-06| 11.03%|            z += mz
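To put those rows in perspective, here is a rough back-of-the-envelope calculation using only the hit counts and times from the profile above. It assumes the per-pixel cost stays constant and scales it to the 600 by 480 buffer mentioned earlier; the timings also include the profiler's own overhead, so the real figures are likely lower.

```python
# Seconds spent on each profiled line of the inner loop (copied from the report)
line_seconds = {
    "for x in range(lineLength)": 0.0352542,
    "if zbuffer[...] > z": 0.0446463,
    "zbuffer[...] = z": 0.0573983,
    "pxarray[...] = col": 0.0462677,
    "z += mz": 0.0377977,
}
pixels_drawn = 11673  # hits on the loop body

loop_seconds = sum(line_seconds.values())
per_pixel = loop_seconds / pixels_drawn

# Extrapolate to every pixel of a 600 by 480 screen being touched once
full_screen_seconds = per_pixel * 600 * 480

print(f"inner loop total: {loop_seconds:.4f} s")
print(f"per pixel:        {per_pixel * 1e6:.1f} microseconds")
print(f"full screen:      {full_screen_seconds:.2f} s per frame")
```

Under these assumptions the inner loop alone costs on the order of tens of microseconds per pixel, which adds up to several seconds once most of the screen is covered.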
The Question
What's the best way to draw a bunch of pixels to the screen/place values in an array? I don't want such a massive amount of runtime to be dedicated to just drawing pixels to the screen!
Thank you very much for any help you can provide :)
EDIT: Those measurements were taken with a triangle taking up a small portion of the screen. Now, if the screen is even half covered with pixels it slows to ridiculous framerates, and with a screen mostly covered it literally cannot even reach 1 FPS.
I am willing to try just about anything to optimise this drawing, as I intend to do a basic 3D game on an Arduino, with a much slower clock speed than my computer.
Is there some faster way to loop through pixels? Could I even write something in assembly to maximise efficiency? Does Python allow that, and would it be practical? Thanks again for any help.