What does it mean to "train" a branch predictor?

Question

I was reading this article about a theoretical CPU vulnerability similar to Spectre, and it noted that:

"The attacker needs to train the branch predictor such that it reliably mispredicts the branch."

I roughly understand what branch prediction is and how it works, but what does it mean to "train" a branch predictor? Does it mean biasing one branch so that it is much more computationally expensive than the other, or does it mean (in a loop) repeatedly having the CPU correctly predict a particular branch before proceeding to the next, mispredicted branch?

E.g.,

// Train branch predictor
for (int i = 0; i < 512; i++)
{
    if (true){
        // Do some instructions
    } else {
        // Do some other instruction
    }
}

// The branch predictor is now "trained"/biased to predict the first branch?

// Proceed to attack

Do branch predictors use weights to bias the prediction one way or the other based on previous predictions/mispredictions?

score 4 · Answer 1 · answered Aug 08 '18 at 12:50

It means to create a branch that aliases the branch you're attacking (by putting it at a specific address, maybe the same virtual address as in another process, or a 4k or some other power of 2 offset may work), and running it a bunch of times to bias the predictor.

That way, when the branch you're attacking with Spectre actually runs, it will be predicted the way you want (or, for indirect branches, it will be predicted to jump to the virtual address you want).
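To make "aliasing" concrete, here is a minimal sketch assuming a hypothetical predictor table that is indexed purely by the low 12 bits of the branch address. Real CPUs use undocumented and more complicated index/hash functions, so `TABLE_BITS`, `predictor_index` and `branches_alias` are illustrative names, not anything a real CPU exposes.

```c
#include <assert.h>
#include <stdint.h>

/* ASSUMPTION: a toy predictor table with 2^12 entries, indexed by the low
   12 bits of the branch address. Real index functions are not this simple. */
#define TABLE_BITS 12u
#define TABLE_SIZE (1u << TABLE_BITS)

/* Map a branch's virtual address to a slot in the toy table. */
unsigned predictor_index(uint64_t branch_addr) {
    return (unsigned)(branch_addr & (uint64_t)(TABLE_SIZE - 1u));
}

/* Two branches alias if they share a slot, so training one
   also changes the prediction for the other. */
int branches_alias(uint64_t a, uint64_t b) {
    return predictor_index(a) == predictor_index(b);
}
```

Under this assumption, a branch at 0x401234 and one at 0x402234 (a 4k offset) share a slot, while one at 0x401238 does not.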

Modern TAGE branch predictors index based on branch history (of the other branches in the dynamic instruction stream leading up to this branch), so training them properly can be complicated.

But at the most simplistic level, yes, branch predictors with more than 1 bit of state remember more than just the last branch direction. Wikipedia has a big article about many different implementations of branch prediction, from simple 2-bit saturating counters on up.
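As a rough illustration of the simplest multi-bit scheme, here is a sketch of a single 2-bit saturating counter in C. The type and function names (`counter2`, `counter2_predict`, and so on) are made up for this example; a real predictor has many such entries plus history-based indexing.

```c
#include <assert.h>
#include <stdbool.h>

/* States: 0 = strongly not-taken, 1 = weakly not-taken,
           2 = weakly taken,       3 = strongly taken.   */
typedef struct { unsigned state; } counter2;

/* Predict "taken" when the counter is in one of the two upper states. */
bool counter2_predict(const counter2 *c) {
    assert(c->state <= 3u);
    return c->state >= 2u;
}

/* Move one step toward the actual outcome, saturating at 0 and 3. */
void counter2_update(counter2 *c, bool taken) {
    if (taken) {
        if (c->state < 3u) c->state++;
    } else {
        if (c->state > 0u) c->state--;
    }
}

/* "Training": feed the same outcome n times in a row. */
void counter2_train(counter2 *c, bool taken, int n) {
    for (int i = 0; i < n; i++) counter2_update(c, taken);
}
```

Starting from strongly taken, one not-taken outcome is not enough to flip the prediction; two are. Once saturated at strongly not-taken, a single taken outcome doesn't flip it back either, which is why a trained counter is "sticky".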

Training them involves making a branch you control go the same way repeatedly.

Specifically, you'd put something like this asm in a loop (at a known address), and run it repeatedly.

xor   eax,eax    ; eax=0 and thus set ZF
jnz   .target    ; always not-taken

Then the target branch will be predicted to fall through and will speculatively run the Spectre "gadget" you want, even though it's normally taken.
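A C-level version of the same idea, as a sketch only: a constant condition like `if (true)` gets folded away by the compiler, so there would be no branch left to train. Reading the condition from a `volatile` variable at least forces a real load and compare each iteration. The names `always_zero`, `sink` and `train_not_taken` are made up here, and note that portable C gives you no control over the branch's address, and the compiler may still choose a branchless encoding, so a real attack would use asm as above.

```c
#include <assert.h>

/* volatile: the compiler must reload these on every access, so the
   condition below cannot be constant-folded like `if (true)`. */
static volatile int always_zero = 0;
static volatile int sink = 0;

/* Execute the same conditional `iterations` times with the same outcome
   (condition false, so the else path runs). Returns how many times the
   else path was executed. */
int train_not_taken(int iterations) {
    assert(iterations >= 0);
    int fallthroughs = 0;
    for (int i = 0; i < iterations; i++) {
        if (always_zero != 0) {
            sink = 1;          /* never executed while always_zero stays 0 */
        } else {
            fallthroughs++;    /* the path we repeatedly go down */
        }
    }
    return fallthroughs;
}
```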

score 1 · Answer 2 · answered Aug 08 '18 at 12:49

A branch predictor works by remembering how branches behaved recently. The simplest form of prediction just remembers which way a branch went the last time it was executed; more complex predictors exist and are common.

The "training" is simply populating that memory. For the simple (1-value) predictor, that means taking the branch you want to favour, once. For complex predictors, it will mean executing the favoured branch multiple times until the processor reliably predicts the desired outcome.
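To put numbers on "once" versus "multiple times", here is a small sketch that counts how many same-direction executions an idealized n-bit saturating counter needs before it predicts the outcome you want. The function name `executions_until_trained` and the model itself are assumptions for illustration; real predictors mix in branch history and aren't this simple.

```c
#include <assert.h>
#include <stdbool.h>

/* Idealized n-bit saturating counter: predicts "taken" when
   state >= 2^(bits-1). Returns how many executions in the wanted
   direction are needed until the prediction matches want_taken. */
int executions_until_trained(unsigned bits, unsigned state, bool want_taken) {
    assert(bits >= 1u && bits <= 8u);
    unsigned max = (1u << bits) - 1u;
    unsigned threshold = 1u << (bits - 1u);
    assert(state <= max);

    int runs = 0;
    while ((state >= threshold) != want_taken) {
        if (want_taken) state++; else state--;  /* one training execution */
        runs++;
    }
    return runs;
}
```

With `bits = 1` (the last-outcome predictor) a single execution always suffices; with `bits = 2`, starting from the strongly opposite state, it takes two.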
