I'm using eBPF to trace context switches on a bare-metal Ubuntu 20.04 server. For some reason, the timestamp from the traces jumps in value partway through the job's execution, and I'm not sure why. My eBPF tracing program is as follows:
#!/usr/bin/python
#
# trace context switch
import time
start_time = time.time()
from bcc import BPF
print("imports: " + str(time.time() - start_time))
# load BPF program
b = BPF(text="""
TRACEPOINT_PROBE(sched, sched_switch) {
    // cat /sys/kernel/debug/tracing/events/sched/sched_switch/format
    bpf_trace_printk("%d\\n", args->prev_prio);
    return 0;
}
""")
print("BPF program set: " + str(time.time() - start_time))
# data structures
unique_pid_counts = {}
timeline = []
value_errors = 0
time_diffs = []
"""
Collects data from the context switch trace and stores it in the data structures unique_task_counts and timeline
"""
print("Started tracing: " + str(time.time() - start_time))
while True:
    try:
        # get current time
        current_time = time.perf_counter()
        # capture data from trace
        data = b.trace_fields()  # (task, pid, cpu, flags, ts, msg)
        # add task to timeline
        timeline.append(data)
        # output
        # if counter % 2000 == 0:
        time_diffs.append((data, current_time - data[4]))
    except ValueError:
        value_errors += 1
        continue
    except KeyboardInterrupt:
        # feedback to the user
        print("Finished tracing: " + str(time.time() - start_time) + " seconds")
        if value_errors:
            print("value errors: " + str(value_errors))
        # write the differences to a file
        with open("Data/time_diffs.txt", "w") as fl:
            fl.write("\n".join(str(x) for x in time_diffs))
        # process the data
        from data_process import tracing_data_process
        tracing_data_process(timeline)
        # finished tracing
        print("Finished processing timeline, total run: " + str(round(time.time() - start_time, 3)) + " seconds")
        exit()
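To see whether the two clock domains involved even agree on this machine, I can compare them directly alongside the tracer. This is a quick diagnostic sketch of my own, not part of the tracer itself; it relies on time.perf_counter() being backed by CLOCK_MONOTONIC on Linux, and CLOCK_BOOTTIME being available (Linux-only, Python 3.7+):

```python
import time

# Compare the clocks that could plausibly back each timestamp.
# On Linux, time.perf_counter() uses CLOCK_MONOTONIC under the hood,
# while CLOCK_BOOTTIME additionally advances across suspend, so any
# divergence between these readings would narrow down the clock domain
# the trace timestamp is coming from.
mono = time.clock_gettime(time.CLOCK_MONOTONIC)
boot = time.clock_gettime(time.CLOCK_BOOTTIME)  # Linux-only
perf = time.perf_counter()

print("CLOCK_MONOTONIC:          ", mono)
print("CLOCK_BOOTTIME:           ", boot)
print("perf_counter:             ", perf)
print("boottime - monotonic:     ", boot - mono)
print("perf_counter - monotonic: ", perf - mono)
```

Logging these periodically during a run would show whether perf_counter itself steps, or only the trace timestamp does.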
I store the information from the traces in a file, part of which is shown below. The format is ((task name, process ID, CPU number, flags, timestamp, message), python performance counter - timestamp). The jump occurs between the seventh and eighth lines, where the offset drops from roughly 344.77 to -0.45:
((b'kworker/u32:0', 62419, 13, b'd...', 5032.112517, b'120'), 344.77145957000084)
((b'<idle>', 0, 13, b'd...', 5032.112689, b'120'), 344.77129126600084)
((b'kworker/u32:0', 62419, 13, b'd...', 5032.112701, b'120'), 344.77128291699955)
((b'<idle>', 0, 13, b'd...', 5032.114244, b'120'), 344.7697435989994)
((b'kworker/u32:0', 62419, 13, b'd...', 5032.114271, b'120'), 344.7697202419995)
((b'<idle>', 0, 1, b'd...', 5032.114313, b'120'), 344.76968189200034)
((b'sshd', 45772, 1, b'd...', 5032.114459, b'120'), 344.7695395889996)
((b'', 0, 14, b'd...', 5377.338941, b'120'), -0.45493316400006734)
((b'kworker/u32:4', 21613, 14, b'd...', 5377.338969, b'120'), -0.4549573809999856)
((b'<idle>', 0, 1, b'd...', 5377.33901, b'120'), -0.4549946520000958)
((b'sshd', 45772, 1, b'd...', 5377.339155, b'120'), -0.455133453999224)
((b'<idle>', 0, 5, b'd...', 5377.346264, b'120'), -0.4622384320000492)
((b'rcu_sched', 11, 5, b'd...', 5377.346278, b'120'), -0.462244209000346)
((b'<idle>', 0, 14, b'd...', 5377.346296, b'120'), -0.4622557519996917)
((b'kworker/u32:4', 21613, 14, b'd...', 5377.346316, b'120'), -0.4622718830005397)
((b'<idle>', 0, 14, b'd...', 5377.346673, b'120'), -0.46262520700020104)
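For what it's worth, doing the arithmetic on the two traces on either side of the jump (values copied from the dump above) shows the two changes cancel almost exactly:

```python
# Trace timestamp and (perf_counter - timestamp) just before and after the jump.
ts_before, diff_before = 5032.114459, 344.7695395889996
ts_after, diff_after = 5377.338941, -0.45493316400006734

ts_jump = ts_after - ts_before          # how far the trace timestamp jumped
diff_change = diff_before - diff_after  # how much the offset shrank

print("timestamp jump: %.6f s" % ts_jump)
print("offset change:  %.6f s" % diff_change)
# The two agree to within tens of microseconds: the trace timestamp
# stepped forward by almost exactly the old offset, as if the trace
# clock snapped onto the clock behind time.perf_counter().
```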
I'm not sure why this behavior is occurring. It may be an issue with my code (which is why I posted on SO), or it may be an issue with Linux behavior. I would appreciate any suggestions for modifications. This code is being used for research, so I'm open to experimentation.
Update + More details: I've noticed this same timestamp jump across multiple runs and servers. Every time, the difference between the timestamp and the performance counter decreases. The jump takes a while to occur, and in one case I observed more than one jump (both reducing the difference). After the timestamp settled, I noticed only minor fluctuations around the expected values. After a restart, the timestamp values remain constant and as expected.
Edit 2: After the computer is restarted, the timestamp behaves as expected. The behavior persists after the first run, though.
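Since the behavior survives across runs but resets on reboot, one thing I still want to rule out is the clock ftrace uses to stamp trace_pipe output. By default it is "local", a fast per-CPU clock that is not guaranteed to match CLOCK_MONOTONIC, the clock behind time.perf_counter(). A sketch for checking the active trace clock (the tracefs path below is the usual one, but it needs root and a mounted debugfs):

```python
def read_trace_clock(path="/sys/kernel/debug/tracing/trace_clock"):
    """Return the list of available ftrace clocks; the active one is in [brackets]."""
    try:
        with open(path) as f:
            return f.read().split()
    except OSError:
        return None  # tracefs/debugfs not mounted, or no permission

clocks = read_trace_clock()
if clocks is None:
    print("trace_clock not readable (root and a mounted debugfs required)")
else:
    active = next(c for c in clocks if c.startswith("["))
    print("active trace clock:", active.strip("[]"))
```

If it reports local, one experiment would be to switch it (echo mono > /sys/kernel/debug/tracing/trace_clock) and see whether the jump and the residual offset go away; I have not verified this yet.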