Self-modifying code for trace hooks?

Question

I'm looking for the lowest-overhead way of inserting trace/logging hooks into some very performance-sensitive driver code. The logging has to always be compiled in, but most of the time it should do nothing (and do nothing very fast).

There isn't anything much simpler than having a global on/off word and doing `if (enabled) { log(); }`. However, if possible I'd like to avoid even the cost of loading that word every time I hit one of my hooks. It occurs to me that I could potentially use self-modifying code for this: everywhere I have a call to my trace function, I overwrite the jump with a NOP when I want to disable the hooks, and restore the jump when I want to enable them.
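For comparison, here is a minimal sketch of the flag-check baseline described above. All names (`trace_enabled`, `trace_log`, `TRACE`, `do_work`) are made up for illustration, and `__builtin_expect` assumes GCC or Clang.

```c
#include <stdio.h>

/* Hypothetical names throughout; none of this comes from a real driver. */
static volatile int trace_enabled = 0;  /* the global on/off word */
static unsigned long trace_hits = 0;    /* counts emitted records (demo only) */

static void trace_log(const char *msg)
{
    trace_hits++;
    fprintf(stderr, "trace: %s\n", msg);
}

/* Each hook costs one load of trace_enabled plus a branch that is
 * hinted as "usually not taken" via __builtin_expect (GCC/Clang). */
#define TRACE(msg) \
    do { if (__builtin_expect(trace_enabled, 0)) trace_log(msg); } while (0)

static int do_work(int x)
{
    TRACE("do_work entered");
    return x * 2;
}
```

With the flag clear, `do_work` pays only for the load and the branch. That per-hook load is the cost the self-modifying approach is meant to remove.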

A quick google doesn't turn up any prior art on this -- has anyone done it? Is it feasible, are there any major stumbling blocks that I'm not foreseeing?

(Linux, x86_64)

Watch out for the possibility of write/execute mode exclusivity. It can make writing self modifying code rather harder... — dmckee --- ex-moderator kitten, Dec 29 '10 at 01:25

score 4 · Answer 1 · answered Dec 29 '10 at 01:33

Does it matter if your compiled driver is suddenly twice as large?

Build two code paths: one with logging, one without. Use global function pointers to jump into the performance-sensitive sections, and overwrite the pointers as appropriate.
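A minimal sketch of the two-path idea, assuming a single hot function. The names `process_plain`, `process_logged`, `process` and `set_logging` are hypothetical.

```c
#include <stdio.h>

static unsigned long log_count = 0;  /* demo counter, not part of the technique */

/* Path compiled without any logging. */
static int process_plain(int x)
{
    return x + 1;
}

/* Same work, with logging compiled in. */
static int process_logged(int x)
{
    log_count++;
    fprintf(stderr, "process(%d)\n", x);
    return x + 1;
}

/* Global function pointer: callers always go through this. */
static int (*process)(int) = process_plain;

/* Swap the whole code path. In multithreaded code this store would
 * need to be atomic; that is left out of this sketch. */
static void set_logging(int on)
{
    process = on ? process_logged : process_plain;
}
```

Callers write `process(x)` and never test a flag. The indirect call still loads the pointer from memory, but only once per entry into the section rather than once per hook inside it.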

Nicholas Knight

score 4 · Accepted Answer · answered Dec 29 '10 at 04:31

Yes, this technique has been implemented within the Linux kernel, for exactly the same purpose (tracing hooks).

See the LWN article on Jump Labels for a starting point.

There aren't really any major stumbling blocks, but there are a few minor ones: multithreaded processes (you will have to stop all other threads while you're enabling or disabling the code), and an incoherent instruction cache (you'll need to ensure the I-cache is flushed on every core).
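To make the NOP/jump swap concrete, here is a user-space sketch for x86_64 Linux. It is not the kernel's jump-label implementation. It patches a small hand-assembled stub in an `mmap`'d page rather than compiler-generated code, it is single-threaded (so it sidesteps the stop-other-threads problem), and it toggles page permissions with `mprotect` because many systems do not allow a page to be writable and executable at once. The names `stub_init`, `stub_patch` and `stub_call` are hypothetical.

```c
#define _GNU_SOURCE
#include <stdint.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

/* x86_64 only. Stub layout (built by hand for this demo):
 *   +0  patch site, 5 bytes: either a 5-byte NOP or "jmp +3"
 *   +5  xor eax,eax ; ret      -> returns 0 (hook disabled)
 *   +8  mov eax,1   ; ret      -> returns 1 (hook enabled)
 */
static const uint8_t NOP5[5]     = { 0x0F, 0x1F, 0x44, 0x00, 0x00 };
static const uint8_t JMP_REL3[5] = { 0xE9, 0x03, 0x00, 0x00, 0x00 };
static const uint8_t TAIL[]      = { 0x31, 0xC0, 0xC3,
                                     0xB8, 0x01, 0x00, 0x00, 0x00, 0xC3 };

static uint8_t *stub_page;
static size_t   stub_len;

/* Overwrite the 5-byte patch site, making the page writable only
 * for the duration of the write. Returns 0 on success, -1 on error. */
static int stub_patch(const uint8_t site[5])
{
    if (mprotect(stub_page, stub_len, PROT_READ | PROT_WRITE) != 0)
        return -1;
    memcpy(stub_page, site, 5);
    if (mprotect(stub_page, stub_len, PROT_READ | PROT_EXEC) != 0)
        return -1;
    /* GCC/Clang builtin; effectively a no-op on x86 but needed on
     * architectures whose instruction cache is not coherent. */
    __builtin___clear_cache((char *)stub_page, (char *)stub_page + 5);
    return 0;
}

/* Allocate one page, lay out the stub, start with the hook disabled. */
static int stub_init(void)
{
    stub_len = (size_t)sysconf(_SC_PAGESIZE);
    void *p = mmap(NULL, stub_len, PROT_READ | PROT_WRITE,
                   MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (p == MAP_FAILED)
        return -1;
    stub_page = p;
    memcpy(stub_page + 5, TAIL, sizeof TAIL);
    return stub_patch(NOP5);
}

/* Execute the stub: 0 means the NOP fell through, 1 means the jump ran. */
static int stub_call(void)
{
    int (*fn)(void) = (int (*)(void))(uintptr_t)stub_page;
    return fn();
}
```

After `stub_init()`, calling `stub_patch(JMP_REL3)` enables the hook and `stub_patch(NOP5)` disables it again; `stub_call()` pays nothing beyond the NOP while the hook is off. Patching real call sites while other threads may be executing them needs the extra care described above.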

Thanks, this is precisely what I was looking for. – kdt Dec 29 '10 at 10:22

score 0 · Answer 3 · answered Dec 29 '10 at 01:26

If there were a way to somehow declare a register global, you could load the register with the value of your word at every entry point into your driver from the outside and then just check the register. Of course, then you'd be denying the use of that register to the optimizer, which might have some unpleasant performance consequences.

Omnifarious

You can't declare a global variable with the `register` keyword, and even if you could, it would be a horrible, horrible idea to lose that register for everything else. – Adam Rosenfield Dec 29 '10 at 03:38
@Adam Rosenfield - There are actually ways to declare register globals with some C compilers. Particularly embedded ones. And yes, losing it for everything else would be very bad. I just threw out a valid idea and pointed out its shortcomings. I don't think that deserves downvoting, but whatever. I'm leaving my answer here regardless. – Omnifarious Dec 29 '10 at 03:46
This would be a reasonable approach on an embedded architecture, but unfortunately I need to operate on at least x86_64, and ideally portably. I don't understand why you were downvoted either... – kdt Dec 29 '10 at 09:58

score 0 · Answer 4 · answered Jan 24 '11 at 03:32

I'm addressing not so much whether this is possible as whether you gain anything significant.

On the one hand, you don't want to test "logging enabled" every time a logging opportunity presents itself. On the other, you need to test "logging enabled" and overwrite the code with either the yes-case or the no-case code. Or does your driver "remember" that the answer was no last time, so that when no is requested again nothing needs to be done?

The logic necessary does not appear to be trivial compared to testing every time.

Olof Forshell

You're misunderstanding. When using rewritten hooks, the only time work is done is when tracing is turned on (insert trace code) or turned off (replace with do-nothing code). The whole point is that when tracing is turned off, the trace points become no-ops. – kdt Jan 24 '11 at 11:29
What I meant was that the logic and code necessary to either insert trace code or replace with do-nothing code might well nullify the time gained by executing no-ops. – Olof Forshell Jan 25 '11 at 03:10
