I'm trying to measure the resolution of getrusage with a simple program:

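#include <cstddef>
#include <cstdio>
#include <cstdlib>
#include <sys/time.h>
#include <sys/resource.h>

int main() {
    struct rusage u = {};
    // Call getrusage outside assert(): with NDEBUG defined, assert() compiles away.
    if (getrusage(RUSAGE_SELF, &u) != 0) {
        std::perror("getrusage");
        return EXIT_FAILURE;
    }
    std::size_t cnt = 0;
    while (true) {
        ++cnt;
        struct rusage uz = {};
        if (getrusage(RUSAGE_SELF, &uz) != 0) {
            std::perror("getrusage");
            return EXIT_FAILURE;
        }
        // Stop at the first reading whose user time differs from the baseline.
        if (u.ru_utime.tv_sec != uz.ru_utime.tv_sec || u.ru_utime.tv_usec != uz.ru_utime.tv_usec) {
            // %zu is the correct conversion for size_t; the casts keep %ld portable.
            std::printf("u:%ld.%06ld\tuz:%ld.%06ld\tcnt:%zu\n",
                    static_cast<long>(u.ru_utime.tv_sec), static_cast<long>(u.ru_utime.tv_usec),
                    static_cast<long>(uz.ru_utime.tv_sec), static_cast<long>(uz.ru_utime.tv_usec),
                    cnt);
            break;
        }
    }
}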

And when I run it, I usually get output similar to the following:

ema@scv:~/tmp/getrusage$ ./gt
u:0.000562  uz:0.000563 cnt:1
ema@scv:~/tmp/getrusage$ ./gt
u:0.000553  uz:0.000554 cnt:1
ema@scv:~/tmp/getrusage$ ./gt
u:0.000496  uz:0.000497 cnt:1
ema@scv:~/tmp/getrusage$ ./gt
u:0.000475  uz:0.000476 cnt:1

This seems to suggest that the resolution of getrusage is around 1 microsecond. I expected it to be around 1 / CLK_TCK, as reported by getconf CLK_TCK (i.e. 100 Hz, hence 10 milliseconds).

What is the true getrusage resolution?
Am I doing anything wrong?

P.S. I'm running this on Ubuntu 20.04, Linux scv 5.13.0-52-generic #59~20.04.1-Ubuntu SMP Thu Jun 16 21:21:28 UTC 2022 x86_64 x86_64 x86_64 GNU/Linux, on a 5950x.

Emanuele
  • Just because system clocks tick at a certain frequency does not mean that the kernel only logs usage at that frequency. If a process attempts to `read()` an empty socket and has nothing to do in the middle of a tick, the kernel is not going to leave the process's CPU idle; it will find something better to do instead. – Sam Varshavchik Jul 11 '22 at 11:10
  • @SamVarshavchik Not quite sure what you mean. getrusage is not supposed to return the 'wall' time, but the 'user' and 'system' time a process (or thread) uses, and according to my knowledge the minimum unit should be 1/CLK_TCK, i.e. 10000 usec. – Emanuele Jul 11 '22 at 11:13
  • I see nothing in getrusage's manual page that claims that. The point is that if after running for five microseconds a thread gets blocked and needs to sleep it makes no sense for the kernel to ding it for a full clock tick's worth of CPU time, instead of only five microseconds. – Sam Varshavchik Jul 11 '22 at 11:43
  • Fair enough. So the resolution can really be that fine, rather than a coarse 1/100 s? I'm asking because on Windows the equivalent, GetThreadTimes, is bound to a minimum time slice of 10 or 15 ms and can't go below that (in fact I just tested similar code on Windows and it does give 15.6 ms). I thought getrusage on Linux would behave the same. – Emanuele Jul 11 '22 at 12:29
  • I find nothing in the manual pages that specifies the minimum tick interval. I would not be surprised to learn that over the entire history of the Linux kernel there were some variations in this area. – Sam Varshavchik Jul 11 '22 at 12:31
  • @SamVarshavchik Want to put together your reasoning in an answer which I would upvote? – Emanuele Jul 14 '22 at 21:52

1 Answer

The publicly defined tick interval is nothing more than a common reference point for the default time-slice that each process gets to run. When its tick expires, the process loses its assigned CPU, which then begins executing some other task that is given its own tick-long time-slice to run.

But that does not guarantee that a given process will run for its full tick. If a process attempts to read() an empty socket and has nothing to do in the middle of a tick, the kernel is not going to leave the process's CPU idle; it will find something better to do instead. The kernel knows exactly how long the process ran for, and there is no reason whatsoever why the actual running time of the process cannot be recorded in its usage statistics, especially if the clock reference used for measuring process execution time offers much finer granularity than the tick interval.

Finally, the modern Linux kernel can be configured not to use tick intervals at all in specific situations, and its advertised tick interval is mostly academic.

Sam Varshavchik
  • I recently had a brief look at the kernel sources and found that the _real_ counters are stored in task_struct https://elixir.bootlin.com/linux/latest/source/include/linux/sched.h#L1015 and are u64. I wonder at what resolution those are kept? – Emanuele Aug 04 '23 at 16:46