This shows up in Process Explorer when I run Python scripts related to the Collatz conjecture. To keep the complexity of those scripts out of the way, I have created this minimal script to demonstrate the problem:-
import time
import psutil
import random
p = psutil.Process()
p.cpu_affinity([4]) # Lock to a CPU to eliminate this as a cause of variation.
print('cpu affinity', p.cpu_affinity())
random.seed(123456789)
bits = 7091330
n = random.getrandbits(bits) # Generate a very large number.
print(bits, 'bits')
t0 = time.perf_counter()
for i in range(100000):
    m = 3 * n # Do something with the very large number.
t1 = time.perf_counter()
print('ran in %f seconds' % (t1 - t0))
print(p.cpu_times())
mi = p.memory_info() # Get page fault statistics.
for field in mi._fields:
    print(field, getattr(mi, field))
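When this is run, it sometimes incurs a very large number of page faults and takes longer, and sometimes does not. Here are two runs, the first with 7421 page faults and the second with 3034148:-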
cpu affinity [4]
7091330 bits
ran in 43.947864 seconds
pcputimes(user=41.5, system=0.21875, children_user=0.0, children_system=0.0)
rss 27992064
vms 20475904
num_page_faults 7421
peak_wset 27996160
wset 27992064
peak_paged_pool 200808
paged_pool 200632
peak_nonpaged_pool 19296
nonpaged_pool 18944
pagefile 20475904
peak_pagefile 20475904
private 20475904
cpu affinity [4]
7091330 bits
ran in 48.507077 seconds
pcputimes(user=42.03125, system=4.625, children_user=0.0, children_system=0.0)
rss 27144192
vms 20529152
num_page_faults 3034148
peak_wset 28934144
wset 27144192
peak_paged_pool 200808
paged_pool 200632
peak_nonpaged_pool 19296
nonpaged_pool 18944
pagefile 20529152
peak_pagefile 21413888
private 20529152
I am running this with Python 3.4.2 under Windows 10 on an i7-4790 CPU with 16GB of RAM. Changing the process priority has no effect. Changing the constant 3 in the for loop to another small number has no effect. Doubling the size of the number n approximately doubles the number of page faults. In each iteration of the for loop, the number of page faults increases either by zero or by about 1/30800 of the number of bits in n.
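As a rough check on that 1/30800 figure, here is a small sketch of how many bits of a Python int fit in one memory page. It assumes 4 KiB pages (my assumption, typical for x86-64 Windows, not something I measured) and reads the digit layout from sys.int_info. The helper names bits_per_page and pages_for_bits are made up for illustration.

```python
import sys

PAGE_SIZE = 4096  # assumption: 4 KiB pages; not taken from the measurements above


def bits_per_page(page_size=PAGE_SIZE):
    """Number of integer value bits stored in one page of int digits."""
    info = sys.int_info
    return page_size * info.bits_per_digit / info.sizeof_digit


def pages_for_bits(bits, page_size=PAGE_SIZE):
    """Approximate number of pages spanned by the digits of a `bits`-bit int."""
    info = sys.int_info
    digits = -(-bits // info.bits_per_digit)  # ceiling division
    return digits * info.sizeof_digit / page_size


print(bits_per_page())          # 30720.0 with either 15-bit or 30-bit digits
print(pages_for_bits(7091330))  # roughly 230.8 pages for the n used above
```

If the page-size assumption holds, 30720 bits per page is close to the observed 1/30800, which would be consistent with roughly one page fault per page of the freshly allocated result of 3 * n. That is an inference from the arithmetic, not something the measurements prove.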
I have discovered that page faults in the loop can be eliminated if the "Do something with the very large number" step is confined to in-place operations such as *=, //=, etc. It may also be necessary to use the gmpy2 xmpz multiple-precision integer type for the very large number.
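A minimal sketch of what I mean by in-place operations, assuming gmpy2 is installed. The function name churn_in_place and the smaller bit count are only for illustration. If gmpy2 is missing, the sketch falls back to the built-in int so that it still runs, but int is immutable, so in that case each augmented assignment creates a new object and nothing is really done in place.

```python
import random

try:
    import gmpy2              # third-party; assumed to be installed
    make_big = gmpy2.xmpz     # mutable multiple-precision integer
except ImportError:
    make_big = int            # fallback only keeps the sketch runnable; int is immutable


def churn_in_place(value, iterations):
    """Multiply and divide by 3 using only augmented assignments."""
    n = make_big(value)
    alias = n                 # second reference, used to detect rebinding
    for _ in range(iterations):
        n *= 3                # with xmpz this updates the existing object
        n //= 3               # floor division undoes the multiplication exactly
    return int(n), alias is n


random.seed(123456789)
start = random.getrandbits(100000)  # much smaller than the 7091330 bits used above
result, same_object = churn_in_place(start, 100)
print(result == start, same_object)
```

The second printed value reports whether n is still the same object after the loop, which is what I would expect with xmpz and not with a plain int. This sketch does not measure page faults itself; the p.memory_info() call from the script above can be used around the loop for that.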