During a discussion, a developer told me that
- the likely/unlikely GCC optimization hints
- placing the most common branch first in the code
have no effect and should be ignored on Intel processors. The stated reason is the dynamic branch prediction employed by Intel. I have two questions for which I could not find an explicit answer:
- Is branch prediction data global for the processor (core), or is it per process?
- If it is per process, is the branch target buffer state kept for the entire lifetime of the process, or is it flushed when the process has used up its timeslice and the instruction cache is flushed, or when the process moves to another core?
Assumptions:
- Linux
- Skylake Intel processor
- Several separate processes run on one core.
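
For reference, this is a minimal sketch of the kind of code the discussion was about. The `likely`/`unlikely` macros are the usual wrappers around GCC's `__builtin_expect` (the Linux kernel defines them in a similar way); the function `sum_valid` and its parameters are hypothetical and only illustrate the pattern. As far as I understand, the hint only influences how the compiler lays out the code, which is why I am asking what the hardware predictor does on top of it.

```c
#include <stddef.h>

/* Usual wrappers around the GCC/Clang builtin; not part of standard C. */
#define likely(x)   __builtin_expect(!!(x), 1)
#define unlikely(x) __builtin_expect(!!(x), 0)

/* Hypothetical example: sum the non-negative values and count the
 * (assumed rare) negative ones as errors. */
long sum_valid(const int *values, size_t n, size_t *errors)
{
    long sum = 0;

    *errors = 0;
    for (size_t i = 0; i < n; i++) {
        if (unlikely(values[i] < 0)) {
            (*errors)++;        /* rare path, hinted as cold */
            continue;
        }
        sum += values[i];       /* common path, hinted as hot */
    }
    return sum;
}
```

The same function could be written without the macros by simply putting the common case first in the `if`/`else`, which is the second technique mentioned above.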