CPU time jumps a lot in a virtual machine

Question

I have a C++ program running with 20 threads (boost threads) on a RHEL 6.5 system virtualized on a Dell server. The result is deterministic, but the CPU time and wall time vary a lot between runs. Sometimes it takes 200s of CPU time to finish, sometimes up to 300s. This bothers me because performance is a criterion for our testing.

I've replaced the boost::timer::cpu_timer I originally used for the wall/CPU time calculation with the system APIs 'clock_gettime' and 'getrusage'. It doesn't help.
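For reference, a minimal sketch of this kind of measurement, assuming Linux. The names `TimeSample`, `sample_times` and `busy_work` are made up for illustration and are not from the actual program. `RUSAGE_SELF` covers all threads of the process, so with 20 threads the CPU time can legitimately exceed the wall time.

```cpp
#include <cassert>
#include <ctime>
#include <sys/resource.h>
#include <sys/time.h>

// Snapshot of wall-clock time and process CPU time, all in seconds.
// (Hypothetical struct, for illustration only.)
struct TimeSample {
    double wall;  // monotonic wall-clock time
    double user;  // CPU time spent in user mode, summed over all threads
    double sys;   // CPU time spent in kernel mode, summed over all threads
};

static double to_seconds(const struct timeval& tv) {
    return tv.tv_sec + tv.tv_usec / 1e6;
}

// Take one sample using clock_gettime and getrusage.
TimeSample sample_times() {
    struct timespec ts;
    clock_gettime(CLOCK_MONOTONIC, &ts);  // not affected by clock adjustments

    struct rusage ru;
    getrusage(RUSAGE_SELF, &ru);          // whole process, all threads

    TimeSample s;
    s.wall = ts.tv_sec + ts.tv_nsec / 1e9;
    s.user = to_seconds(ru.ru_utime);
    s.sys  = to_seconds(ru.ru_stime);
    return s;
}

// Stand-in workload that burns some user CPU time.
double busy_work(long iterations) {
    volatile double acc = 0.0;
    for (long i = 0; i < iterations; ++i) acc = acc + i * 0.5;
    return acc;
}
```

Take one sample before the workload and one after, then subtract field by field. On older glibc versions (such as the one shipped with RHEL 6) `clock_gettime` needs `-lrt` at link time.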

Is it because of 'steal time' taken by the hypervisor (VMware)? Is steal time included in the user/sys time reported by 'getrusage'?

Does anyone have knowledge of this? Many thanks.

score 0 · Answer 1 · answered Mar 22 '19 at 07:48

It would be useful if you provided some extra information. For example, are your threads dependent on each other, i.e. is there any synchronization going on among them?

Since you are using a virtual machine, how is your CPU shared with other users of the server? It might be that even a single CPU core is shared, so you do not get the same allocation of CPU resources on every run [this is the steal time you mention above].
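One way to get a hint about this from inside the guest is to compare the steal counter in /proc/stat before and after a run. Below is a rough sketch, assuming a Linux guest whose kernel exposes the `steal` column (the eighth value on the aggregate `cpu` line). Whether that value is meaningful depends on the hypervisor reporting it to the guest, so a reading of zero does not prove that nothing was stolen. `CpuTicks`, `parse_cpu_line`, `read_cpu_ticks` and `steal_fraction` are hypothetical names.

```cpp
#include <cassert>
#include <cmath>
#include <fstream>
#include <sstream>
#include <string>

// First eight counters of the aggregate "cpu" line in /proc/stat (clock ticks).
struct CpuTicks {
    unsigned long long user, nice, system, idle, iowait, irq, softirq, steal;
};

// Parse a line such as "cpu  100 0 50 800 10 0 5 35 0 0".
// Returns false if the label is not exactly "cpu" or a field is missing.
bool parse_cpu_line(const std::string& line, CpuTicks& out) {
    std::istringstream in(line);
    std::string label;
    in >> label;
    if (label != "cpu") return false;
    in >> out.user >> out.nice >> out.system >> out.idle
       >> out.iowait >> out.irq >> out.softirq >> out.steal;
    return !in.fail();
}

// Read the current system-wide counters; false if /proc/stat is unavailable.
bool read_cpu_ticks(CpuTicks& out) {
    std::ifstream stat("/proc/stat");
    std::string line;
    if (!std::getline(stat, line)) return false;
    return parse_cpu_line(line, out);
}

// Fraction of all elapsed ticks between two samples that were reported as steal.
double steal_fraction(const CpuTicks& before, const CpuTicks& after) {
    unsigned long long total =
        (after.user - before.user) + (after.nice - before.nice) +
        (after.system - before.system) + (after.idle - before.idle) +
        (after.iowait - before.iowait) + (after.irq - before.irq) +
        (after.softirq - before.softirq) + (after.steal - before.steal);
    if (total == 0) return 0.0;
    return static_cast<double>(after.steal - before.steal) / total;
}
```

Call `read_cpu_ticks` right before and right after the run and pass both samples to `steal_fraction`. The counters are system-wide, not per process, so this only shows whether the guest as a whole lost CPU during the run.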

Also, you mention that the CPU time differs: this is the time actually spent executing on a CPU ('getrusage' reports it split into user and system time). If you have synchronization among threads (such as a mutex, etc.), then depending on how the operating system wakes up the threads, the overall time might vary.
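To see how much the scheduler is involved, the context-switch counters that 'getrusage' fills in on Linux can be compared between a fast run and a slow run. This is a small sketch with hypothetical names (`SwitchCounts`, `sample_switches`), not a full diagnosis.

```cpp
#include <cassert>
#include <sys/resource.h>
#include <sys/time.h>

// Context-switch counters for the whole process (hypothetical struct).
struct SwitchCounts {
    long voluntary;    // ru_nvcsw: a thread gave up the CPU, e.g. blocked on a mutex or I/O
    long involuntary;  // ru_nivcsw: a thread was preempted by the scheduler
};

// Sample the counters accumulated so far by this process.
SwitchCounts sample_switches() {
    struct rusage ru;
    getrusage(RUSAGE_SELF, &ru);

    SwitchCounts c;
    c.voluntary = ru.ru_nvcsw;
    c.involuntary = ru.ru_nivcsw;
    return c;
}
```

If slow runs show many more voluntary switches, waiting on synchronization is a likely contributor. Many more involuntary switches suggest the threads are competing for CPU inside the guest.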

Not much synchronization, but new threads are frequently launched to do fine-grained jobs and then killed. Even so, isn't the effort of creating and destroying threads categorized as sys CPU time rather than user CPU time? 'getrusage' should distinguish them. Spending >50% of the CPU effort on creating/killing threads does not make sense at all. As for the virtual machines, it's LPAR. In this case, is it possible to exclude the steal time from the collected user CPU time? How can I configure that programmatically? Thanks! - Novak, Mar 22 '19 at 10:52
