I'm looking to improve interface destruction time with FreeBSD. Destroying thousands of interfaces takes several minutes on my test machine running -CURRENT
, and while -- admittedly -- my use case may be an unusual one, I'd like to understand what's taking the system so long.
From my initial observations, I was able to establish that most of the time is spent waiting somewhere inside if_detach_internal()
. So in an attempt to profile this function, I came up with the following DTrace script:
#!/usr/sbin/dtrace -s
#pragma D option quiet
#pragma D option dynvarsize=256m
fbt:kernel:if_detach_internal:entry
{
self->traceme = 1;
t[probefunc] = timestamp;
}
fbt:kernel:if_detach_internal:return
{
dt = timestamp - t[probefunc];
@ft[probefunc] = sum(dt);
t[probefunc] = 0;
self->traceme = 0;
}
fbt:kernel::entry
/self->traceme/
{
t[probefunc] = timestamp;
}
fbt:kernel::return
/self->traceme/
{
dt = timestamp - t[probefunc];
@ft[probefunc] = sum(dt);
t[probefunc] = 0;
}
By hooking to the entry
and return
fbt probes, I'm expecting to get a list of function names and cumulative execution times for every function called by if_detach_internal()
(no matter the stack depth), and filter out anything else.
What I'm getting, however, looks like this (destroying 250 interfaces):
callout_when 1676 sched_load 1779 if_rele 1801 [...] rt_unlinkrte 10296062843 sched_switch 10408456866 rt_checkdelroute 11562396547 rn_walktree 12404143265 rib_walk_del 12553013469 if_detach_internal 24335505097 uma_zfree_arg 25045046322788 intr_event_schedule_thread 58336370701120 swi_sched 83355263713937 spinlock_enter 116681093870088 [...] spinlock_exit 4492719328120735 cpu_search_lowest 16750701670277714
Timing information for at least some of the functions seems to make sense, but I would expect if_detach_internal()
to be the last entry in the list, with nothing taking longer than that, since this function is at the top of the call tree I'm trying to profile.
Clearly, it is not the case, as I'm also getting measurements for other functions (uma_zfree_arg()
, swi_sched()
, etc...) with seemingly crazy execution times. These results completely annihilate my trust in everything else DTrace tells me here.
What am I missing ? It this approach sound at all ?