Some details depend on the actual GPU architecture, but here is a simplified example, in addition to the answer that Trudbert already gave (+1):
For a branch like this
if (data[threadIndex] > 0.5) {
    data[threadIndex] = 1.0;
}
there may be a set of threads for which the condition is true, and another set of threads for which the condition is false. One can imagine it as if the threads for which the condition is false simply wait until the others have finished their work.
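The "waiting" picture can be sketched on the CPU as one group of threads with an active mask. This is only a simulation under assumptions: the group size of 8, the name `runIfBranch`, and the sequential loop over lanes are all hypothetical and chosen for illustration. Real hardware processes the lanes in lockstep, and the details vary by architecture.

```cpp
#include <array>
#include <cstddef>

// Hypothetical group size for illustration; real GPUs typically use 32 or 64.
constexpr std::size_t kLanes = 8;

// Simulates `if (data[i] > 0.5) data[i] = 1.0;` for one group of threads.
// Returns the number of lanes that were active inside the branch body.
std::size_t runIfBranch(std::array<float, kLanes>& data) {
    // Step 1: every lane evaluates the condition, producing an active mask.
    std::array<bool, kLanes> active{};
    for (std::size_t lane = 0; lane < kLanes; ++lane) {
        active[lane] = data[lane] > 0.5f;
    }

    // Step 2: the body is issued once for the whole group. Lanes whose mask
    // entry is false do nothing for this instruction (they "wait").
    std::size_t activeCount = 0;
    for (std::size_t lane = 0; lane < kLanes; ++lane) {
        if (active[lane]) {
            data[lane] = 1.0f;
            ++activeCount;
        }
    }
    return activeCount;
}
```

Calling `runIfBranch` on an array where half of the values exceed 0.5 returns half the group size: the body still takes one pass for the whole group, but only those lanes do useful work in it.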
Analogously, for a branch like this
if (data[threadIndex] > 0.5) {
    data[threadIndex] = 1.0;
} else {
    data[threadIndex] = 0.0;
}
one can imagine this as all threads executing both paths of the branch, and making sure that the results from the "wrong" path are ignored. This is referred to as "predicated execution".
(More detailed information about this can be found in GPU Gems 2, Chapter 34)
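The predicated if/else described above can also be simulated on the CPU. Again, this is a rough sketch and not how any particular GPU implements it: the group size and the names `kGroupSize` and `runPredicatedIfElse` are hypothetical. The point is only that both results are produced for every lane, and the predicate decides which one is kept.

```cpp
#include <array>
#include <cstddef>

// Hypothetical group size for illustration.
constexpr std::size_t kGroupSize = 8;

// Simulates the if/else above for one thread group using predication:
// both paths are evaluated for every lane, and the per-lane predicate
// selects which result is written back.
void runPredicatedIfElse(std::array<float, kGroupSize>& data) {
    for (std::size_t lane = 0; lane < kGroupSize; ++lane) {
        const bool predicate = data[lane] > 0.5f;  // condition of the branch
        const float thenResult = 1.0f;             // result of the "if" path
        const float elseResult = 0.0f;             // result of the "else" path
        // The result from the "wrong" path is simply discarded.
        data[lane] = predicate ? thenResult : elseResult;
    }
}
```

Note that the work done per lane does not depend on the value of the predicate, which is why guessing the outcome in advance would not save anything here.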
So there is no advantage in predicting the "right" branch: whenever the threads of a group disagree, the group has to go through all paths of the branch anyhow. Hence, there is no reason to introduce branch prediction.