Questions tagged [rdtsc]

RDTSC is the x86 read time stamp counter instruction, often used for high-resolution timing.

See How to Benchmark Code Execution Times on Intel® IA-32 and IA-64 Instruction Set Architectures.

The question "Get CPU cycle count?" covers various caveats of using it: on modern x86, it counts reference cycles, not actual core clock cycles. (It also shows how to access it from C++.)

The earliest CPUs to support RDTSC had a fixed clock frequency, and some OSes found it more useful as a low-overhead time source for time-of-day functions, so CPU vendors eventually changed it to what it is now: a fixed-frequency, nonstop counter.

It can be out-of-sync across different cores. (Some CPUs avoid that for cores in the same physical package.)
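As a rough illustration of the points above, here is a minimal C sketch of reading the counter and converting a tick delta to seconds. It assumes GCC or Clang on x86-64 (the `__rdtsc` and `_mm_lfence` intrinsics from `<x86intrin.h>`; MSVC provides them in `<intrin.h>`). The helper names `tsc_now` and `tsc_ticks_to_seconds` are hypothetical, and the TSC frequency has to be supplied by the caller because it is generally not the same as the current core clock.

```c
#include <stdint.h>
#include <x86intrin.h>  /* __rdtsc, _mm_lfence (GCC/Clang) */

/* Hypothetical helper: read the TSC with LFENCE on both sides.
 * The fences are meant to keep the read from being reordered with
 * the surrounding instructions; how strictly LFENCE does this is
 * CPU-dependent, so treat this as an assumption. */
static inline uint64_t tsc_now(void) {
    _mm_lfence();
    uint64_t t = __rdtsc();
    _mm_lfence();
    return t;
}

/* Hypothetical helper: convert a tick delta to seconds.
 * tsc_hz is the TSC (reference) frequency in Hz, which the caller
 * must obtain or calibrate separately. */
static inline double tsc_ticks_to_seconds(uint64_t ticks, double tsc_hz) {
    return (double)ticks / tsc_hz;
}
```

A typical use is to call `tsc_now()` before and after the code under test and pass the difference to `tsc_ticks_to_seconds`. Because the counter can be out of sync across cores, pinning the thread to one core is a common precaution.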

137 questions
5
votes
2 answers

rdtsc's return value is _always_ mod 10 == 0 on Atom N450

On my E8200 box this doesn't occur, but on my Atom N450 netbook (both running OpenSuse 11.2), whenever I read the CPU's TSC, the returned value is mod 10 == 0, i. e. it is without remainder divisible by 10. I'm using the RDTSC value for measuring…
4
votes
5 answers

Why does the first printf take longer?

I was playing around with high-precision timers and one of my first tests was to use rdtsc to measure printf. Below is my test program followed by its output. The thing I noticed is that the first time printf runs, it consistently takes about 25…
Alex
  • 14,973
  • 13
  • 59
  • 94
4
votes
1 answer

How to stop VC++ compiler from reordering code?

I have code like this: const uint64_t tsc = __rdtsc(); const __m128 res = computeSomethingExpensive(); const uint64_t elapsed = __rdtsc() - tsc; printf( "%" PRIu64 " cycles", elapsed ); In release builds, this prints garbage like “38 cycles”…
Soonts
  • 20,079
  • 9
  • 57
  • 130
4
votes
1 answer

How stable is TSC (TimeStamp Counter) from user space for Intel x86-64 CPUs in 2020?

Sometimes I need a proper way to measure performance at nanosecond resolution from my user-space application in order to include the syscall delays in my measurement. I read many old (10-year-old) articles saying it isn't stable and that they are going to remove it from…
Alexis
  • 2,136
  • 2
  • 19
  • 47
4
votes
1 answer

Multiple nop instructions do not consistently take longer than a single nop instruction

I am timing multiple NOP instructions and a single NOP instruction in C++, using rdtsc. However, I don't get an increase in the number of cycles it takes to execute NOPs in proportion to the number of NOPs executed. I'm confused as to why this is…
fraiser
  • 929
  • 12
  • 28
4
votes
1 answer

Detect Time-Stamp Counter Restriction or Availability

I want to check if the RDTSC instruction is available. There must be an Intel Pentium or newer processor and either the TSD flag in register CR4 is clear or it is set and the CPL equals 0. So, there's no problem to obtain the current privilege level…
0xbadf00d
  • 17,405
  • 15
  • 67
  • 107
4
votes
0 answers

Convert CPU cycles into seconds in C programming

I was trying to figure out if there is any easy method to convert the CPU cycles obtained in C using rdtsc() function into time in seconds. ex:- unsigned long long start, stop; start = rdtsc(); stop = rdtsc(); printf(" CPU CYCLES: %llu\n",…
nk134
  • 43
  • 1
  • 7
4
votes
4 answers

How to detect if RDTSC returns a constant rate counter value?

It seems most newer CPUs from both AMD and Intel implement rdtsc as a constant rate counter, avoiding the issues caused by frequency changing as a result of things like TurboBoost or power saving settings. As rdtsc is a lot more suitable for…
Suma
  • 33,181
  • 16
  • 123
  • 191
4
votes
1 answer

How to ensure that RDTSC is accurate?

I've read that RDTSC can give false readings and should not be relied upon. Is this true, and if so, what can be done about it?
Johan
  • 74,508
  • 24
  • 191
  • 319
4
votes
2 answers

Assembler instruction: rdtsc

Could someone help me understand the assembler given in https://gcc.gnu.org/onlinedocs/gcc/Extended-Asm.html It goes like this: uint64_t msr; asm volatile ( "rdtsc\n\t" // Returns the time in EDX:EAX. "shl $32, %%rdx\n\t" // Shift…
Leta
  • 331
  • 1
  • 4
  • 16
4
votes
2 answers

How to benchmark on multi-core processors

I am looking for ways to perform micro-benchmarks on multi-core processors. Context: At about the same time desktop processors introduced out-of-order execution that made performance hard to predict, they, perhaps not coincidentally, also introduced…
Pascal Cuoq
  • 79,187
  • 7
  • 161
  • 281
4
votes
4 answers

Is there a cheaper serializing instruction than cpuid?

I have seen the related question including here and here, but it seems that the only instruction ever mentioned for serializing rdtsc is cpuid. Unfortunately, cpuid takes roughly 1000 cycles on my system, so I am wondering if anyone knows of a…
merlin2011
  • 71,677
  • 44
  • 195
  • 329
3
votes
1 answer

cpp linux: about rdtsc

I'm using the following function in my code: static __inline__ unsigned long long rdtsc(void){ unsigned long long int x; __asm__ volatile (".byte 0x0f, 0x31" : "=A" (x)); return x; } Does this function return number of ticks since last…
kakush
  • 3,334
  • 14
  • 47
  • 68
3
votes
1 answer

Failed to reproduce the high-precision time measuring kernel module from intel's white paper

I am trying to reproduce How to Benchmark Code Execution Times on Intel IA-32 and IA-64 Instruction Set Architectures White Paper. This white paper provides a kernel module to accurately measure the execution time of a piece of code, by disabling…
Bin Yan
  • 117
  • 8
3
votes
1 answer

Can constant non-invariant tsc change frequency across cpu states?

I used to benchmark Linux System Calls with rdtsc to get the counter difference before and after the system call. I interpreted the result as wall clock timer since TSC increments at constant rate and does not stop when entering halt state. The…
Some Name
  • 8,555
  • 5
  • 27
  • 77