When I use the following code:
#define MAX_RADIUS 55
#define KERNEL_SIZE (MAX_RADIUS * 2 + 1)
...
float kernel[KERNEL_SIZE];
...
float4 PS_GaussianBlur(float2 texCoord : TEXCOORD) : COLOR0
{
    float4 color = float4(0.0f, 0.0f, 0.0f, 0.0f);
    //add the right side offset pixels to the color
    for (int i = 0; i < MAX_RADIUS; i++)
    {
        if(kernel[i] != 0) //this will improve performance for lower filter radii, but increases the const register count
            color += tex2D(colorMap, texCoord + offsets[i]) * kernel[i];
    }
    //add the left side offset pixels to the color
    for (int j = 0; j < MAX_RADIUS; j++)
    {
        if(kernel[j] != 0)
            color += tex2D(colorMap, texCoord - offsets[j]) * kernel[j];
    }
    //finally add the weight of the original pixel to the color
    color += tex2D(colorMap, texCoord) * kernel[MAX_RADIUS];
    return color;
}
The if(kernel[i] != 0) check increases the number of instructions used dramatically!
So my question is this: What increases instruction count? And why would using an if statement increase instruction count by over 400 in a loop that is only 110 instructions long?
EDIT: Above question edited. I mistakenly thought registers were being taken when it was really instructions. However, the question still applies. What would cause 2 for loops (of length 55 each) to increase the instruction count by over 400 with just 1 added if statement within the loop?
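For scale, the numbers above can be turned into a rough per-iteration cost. This is only a back-of-the-envelope sketch in Python, not shader code. It assumes the compiler fully unrolls both loops so that every iteration contributes its own instructions (an assumption, not something confirmed above), and the variable names are made up for illustration.

```python
# Figures taken from the question; the full-unroll premise is an assumption.
MAX_RADIUS = 55            # iterations per loop
loop_count = 2             # right-side loop and left-side loop
added_instructions = 400   # the question says "over 400", so this is a lower bound

# Total number of times the if statement appears once both loops are unrolled.
total_iterations = loop_count * MAX_RADIUS

# Lower bound on the extra instructions each if statement would account for.
extra_per_iteration = added_instructions / total_iterations

print(f"{total_iterations} iterations, "
      f"at least {extra_per_iteration:.2f} extra instructions each")
```

Under that assumption, each if statement would account for more than three extra instructions per iteration rather than a single one.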