Variable event count based sampling using perf

Question

I am trying to read the PMU event counters whenever a particular event counter overflows, using perf. I know that perf works with a fixed sample period. What I am looking for is a way to read the PMU counters with a different sample period each time within an application. Yes, I have a requirement that demands a variable sampling period. E.g., if an application has 1000 instructions, I want to read the PMU event counters after 200, 150, 300, 50, 150, 100, and 50 instructions. Any directions would be really helpful.

The hardware is based on raising an interrupt (or for PEBS, recording a sample in a buffer) when a down-counter reaches zero, then resetting the counter. For non-PEBS, I think it might in theory be possible to reset the counter to a different value in the interrupt handler, but this of course has a *huge* amount of overhead for anything like a 100-instruction interval. — Peter Cordes, Nov 17 '19 at 21:18
IDK if PEBS can sample other counters when one triggers; if so you could maybe sample a precise `instructions` counter on the GCD of your sequence (50) and later combine counts between chunks. But given OoO exec, I wouldn't count on it being perfectly meaningful for attributing other events to specific instructions. — Peter Cordes, Nov 17 '19 at 21:20
Well, I don't need such a small instruction interval. Maybe 15-20 samples (each at a different multiple of 100,000 instructions) for 1 second of application execution time. — PHP, Nov 18 '19 at 10:10
Oh, well that's a completely different story. You're not aiming for super-fine granularity after all. IIRC it is possible to ask `perf` to sample counters for other events when one fires, or at least you can do that with PAPI calling the library functions manually. — Peter Cordes, Nov 18 '19 at 10:22
Yeah, you are right, I am not looking for super-fine granularity. It's pretty straightforward with perf if the sampling period is fixed. But how do I reset the sampling period to a different value after every sample is recorded? IMO I probably have to do that at the kernel level using perf. — PHP, Nov 18 '19 at 10:32
You don't need to if you can post-process the recorded output to combine samples from adjacent bins when you want to treat that as one larger window. If 100k is low enough overhead then it's probably best to let the usual machinery do its thing. — Peter Cordes, Nov 18 '19 at 10:34
It's a good idea. But then I wouldn't know how to recombine LLC misses, cycles and other PMU events. They are entirely dependent on the workload and interference from other cores. — PHP, Nov 18 '19 at 11:47
That's why you sample *all* the counters on every *instructions* event so each group of samples is associated with a certain group of instructions (modulo OoO exec and memory parallelism / HW prefetch). I think I've seen an SO question about getting `perf` to do that, but I forget how. Maybe only with custom code + PAPI. — Peter Cordes, Nov 18 '19 at 12:01
Or is there a way in perf to limit the number of samples I collect? This way I could incrementally narrow down the region of code that I want to measure. — PHP, Nov 20 '19 at 17:06
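
The post-processing idea from the comments (sample every counter on a fixed `instructions` period, then merge adjacent bins into larger windows) could be sketched roughly as below. This is only an illustration, not a perf feature: `base_period` and `merge_bins` are hypothetical helper names, and the input layout (one dict of per-bin counter deltas per fixed-period sample) is an assumption about how the recorded output has already been parsed. The fixed period is taken as the GCD of the desired windows, as suggested above.

```python
from functools import reduce
from math import gcd


def base_period(windows):
    """Largest fixed sample period that divides every desired window."""
    return reduce(gcd, windows)


def merge_bins(bins, windows, period):
    """Combine fixed-period bins into variable-length windows.

    bins    -- list of dicts, one per fixed-period sample, each holding the
               per-bin delta of every counter (assumed, hypothetical layout)
    windows -- desired window lengths in instructions
    period  -- the fixed sample period the bins were recorded with
    """
    merged, pos = [], 0
    for w in windows:
        if w % period:
            raise ValueError(f"window {w} is not a multiple of {period}")
        n = w // period
        chunk = bins[pos:pos + n]
        if len(chunk) < n:
            break  # recording ended before this window was complete
        total = {}
        for b in chunk:
            for name, delta in b.items():
                total[name] = total.get(name, 0) + delta
        merged.append(total)
        pos += n
    return merged


# Window sequence from the question; the cycle counts are made-up test data.
windows = [200, 150, 300, 50, 150, 100, 50]
period = base_period(windows)
bins = [{"instructions": period, "cycles": 10 * (i + 1)}
        for i in range(sum(windows) // period)]
result = merge_bins(bins, windows, period)
for w, r in zip(windows, result):
    print(w, r)
```

Because each bin carries deltas for all counters sampled at the same instant, LLC misses, cycles, and so on are summed per window exactly like the instruction count. As noted in the comments, out-of-order execution and prefetching mean the attribution to a specific instruction range is approximate.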


0 Answers
