On Linux, GCC provides the intrinsic function __rdtsc for reading the CPU's time-stamp counter, so I don't need to use inline asm code, which I am not familiar with.
On the other hand, when reading posts about the asm code, I saw people saying that the rdtsc instruction should be used in combination with cpuid or other fences to flush the pipeline, like this one.
My question is: without using asm code, what is the proper way to flush the pipeline on Linux x64 in order to measure cycles correctly?
Moreover, there is also __rdtscp. Does this function already flush the pipeline, so that it can replace the flush + __rdtsc combination?
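To make the question concrete, here is a minimal sketch of the intrinsics-only pattern I have in mind, assuming GCC or Clang on x86-64 with <x86intrin.h>. The helper names tsc_begin and tsc_end are hypothetical, and using _mm_lfence as the fence is my assumption based on what I have read, not something I have confirmed to be the proper approach. As far as I understand, rdtscp waits for earlier instructions to finish but does not stop later ones from starting early, which is why the sketch still puts a fence after it.

```c
#include <stdint.h>
#include <x86intrin.h>  /* GCC/Clang: __rdtsc, __rdtscp, _mm_lfence */

/* Hypothetical helper: timestamp taken before the code under test.
 * The leading lfence is meant to keep rdtsc from executing until earlier
 * instructions have completed; the trailing one keeps the measured code
 * from starting before the counter has been read. */
static inline uint64_t tsc_begin(void)
{
    _mm_lfence();
    uint64_t t = __rdtsc();
    _mm_lfence();
    return t;
}

/* Hypothetical helper: timestamp taken after the code under test.
 * rdtscp waits for preceding instructions to execute before reading the
 * counter; the trailing lfence keeps later instructions from being
 * started ahead of that read. */
static inline uint64_t tsc_end(void)
{
    unsigned int aux;  /* receives IA32_TSC_AUX (typically a CPU/node id) */
    uint64_t t = __rdtscp(&aux);
    _mm_lfence();
    return t;
}
```

Usage would be start = tsc_begin(); run the code being measured; end = tsc_end(); and then take end - start. Note that on current CPUs the counter ticks at a fixed reference frequency, so the difference is in reference cycles rather than actual core clock cycles.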