I am writing GPU code and I am trying to avoid branching (because I know it hurts GPU performance). The loop currently looks like this:
w = some_number;
h = some_number;
new_w = w * 0.8; // start at 80% of w (w must be assigned first)
for (int i = 0; i < num_rectangles; i++)
{
    // for each 2D vector, get the absolute x and y
    cx = abs(array_of_2d_points[i * 2]);
    cy = abs(array_of_2d_points[i * 2 + 1]);
    // check whether the 2D vector is inside the w x h rectangle
    if (cx < w / 2 && cy < h / 2)
    {
        // if it is inside the rectangle, update new_w (which was previously set to 80% of w)
        new_w = max(cx * 2, new_w);
    }
}
Is there a more clever way to avoid the branch here?
One way I can think of is to convert the bool to an int and use it as a multiplier:
int is_inside = cx < w / 2 && cy < h / 2;
new_w = max(cx * 2 * is_inside, new_w);
But does this actually avoid branching? Does merely using the < comparison cause the GPU to branch?
I tried the bool-to-int approach above, and the speed was roughly the same.
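For reference, here is a minimal, self-contained sketch of the two variants written as plain C on the CPU, so that they can be compared outside the GPU kernel. The function names `widen_branching` and `widen_masked` and the helpers `abs_f` and `max_f` are hypothetical stand-ins (the helpers replace the `abs` and `max` used above so the sketch has no library dependencies), and `float` is an assumed element type. The sketch only checks that both variants compute the same value; it says nothing about what the GPU compiler actually emits. Note that the masked variant matches the original only when `new_w` is non-negative and the points are finite, because an outside point contributes 0 to the `max`.

```c
/* Hypothetical stand-ins for the abs/max used in the question. */
static float abs_f(float x) { return x < 0.0f ? -x : x; }
static float max_f(float a, float b) { return a > b ? a : b; }

/* Variant 1: explicit if, as in the original loop. */
float widen_branching(const float *pts, int n, float w, float h)
{
    float new_w = w * 0.8f; /* start at 80% of w */
    for (int i = 0; i < n; i++) {
        float cx = abs_f(pts[i * 2]);
        float cy = abs_f(pts[i * 2 + 1]);
        if (cx < w / 2 && cy < h / 2) {
            new_w = max_f(cx * 2, new_w);
        }
    }
    return new_w;
}

/* Variant 2: the comparison result (0 or 1) is used as a multiplier.
   An outside point contributes 0, so this assumes new_w >= 0. */
float widen_masked(const float *pts, int n, float w, float h)
{
    float new_w = w * 0.8f;
    for (int i = 0; i < n; i++) {
        float cx = abs_f(pts[i * 2]);
        float cy = abs_f(pts[i * 2 + 1]);
        int is_inside = cx < w / 2 && cy < h / 2;
        new_w = max_f(cx * 2 * is_inside, new_w);
    }
    return new_w;
}
```

With `w = h = 1` and the points (0.45, 0.1), (2.0, 0.1), (-0.1, 3.0), only the first point is inside, so both functions should return roughly 0.9.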