How did debuggers for 16-bit real mode programs produce stack traces?

Question

I'm messing around with running old DOS programs in an emulator, and I've gotten to the point where I'd like to trace the program's stack. However, I'm running into a problem: how to tell near calls from far calls. Some context:

A near call pushes only the IP onto the stack, and is expected to be paired with a ret which pops only the IP to return to.
A far call pushes both the CS and IP onto the stack, and is expected to be paired with a retf which pops both the CS and IP to return to.
There is no way to know whether a call is a near call or a far call, except by knowing which kind of instruction called it, or which return it uses.

Luckily, for the period this program was developed in, BP-based stack frames were very common, so walking the stack doesn't seem to be a problem: I just follow the BP-chain. Unfortunately, getting the CS and/or IP is difficult, because there doesn't seem to be any way for me to determine whether a call is a near call or a far call by looking at the stack alone.
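To make the ambiguity concrete, here is a minimal Python sketch of what one BP-based frame offers. It assumes the conventional `push bp` / `mov bp, sp` prologue and an emulator memory exposed as a flat 1 MiB byte array; the names `linear`, `read16`, and `frame_candidates` are made up for illustration.

```python
def linear(seg, off):
    """Real-mode linear address: segment * 16 + offset (20-bit wrap assumed)."""
    return ((seg << 4) + off) & 0xFFFFF

def read16(mem, seg, off):
    """Read a little-endian 16-bit word at seg:off, wrapping the offset at 64 KiB."""
    lo = mem[linear(seg, off)]
    hi = mem[linear(seg, (off + 1) & 0xFFFF)]
    return lo | (hi << 8)

def frame_candidates(mem, ss, bp, cs):
    """Return (saved_bp, near_guess, far_guess) for the frame at SS:BP.

    Assumed layout after `push bp / mov bp, sp`:
      [bp]   = caller's BP
      [bp+2] = return IP
      [bp+4] = return CS, but only if the function was entered by a far call;
               otherwise it is whatever the caller pushed last (e.g. an argument).
    """
    saved_bp = read16(mem, ss, bp)
    ret_ip = read16(mem, ss, (bp + 2) & 0xFFFF)
    maybe_cs = read16(mem, ss, (bp + 4) & 0xFFFF)
    near_guess = (cs, ret_ip)        # near call: caller shares the current CS
    far_guess = (maybe_cs, ret_ip)   # far call: CS was pushed as well
    return saved_bp, near_guess, far_guess
```

Both guesses are read from the same bytes, which is exactly why the stack alone cannot settle which one is right.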

I have metadata about functions available, so I can tell whether a function is a near or far call if I already know the actual CS and IP, but I can't figure out the IP and CS unless I already know if it's a far call or near call.

I'm having a little success by just guessing and seeing if my guess results in a valid function lookup, but I think this method will produce a lot of false positives.
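Roughly, the guessing looks like the Python sketch below. The metadata format (`FUNCS`, with segment, start offset, end offset, and name) and the function names are hypothetical, and it assumes return addresses use the same segment values as the metadata, which real-mode aliasing does not guarantee.

```python
# Hypothetical function metadata: (segment, start_offset, end_offset, name)
FUNCS = [
    (0x1000, 0x1200, 0x1300, "draw_sprite"),
    (0x3000, 0x0000, 0x0400, "main_loop"),
    (0x3000, 0x1200, 0x1280, "read_keys"),
]

def lookup(funcs, cs, ip):
    """Return the name of the function whose range contains cs:ip, or None."""
    for seg, start, end, name in funcs:
        if seg == cs and start <= ip < end:
            return name
    return None

def classify(funcs, near_guess, far_guess):
    """Try both interpretations of a return address and report which one
    lands inside a known function.

    Returns (kind, name) where kind is "near", "far", "ambiguous" or "unknown".
    """
    near_hit = lookup(funcs, *near_guess)
    far_hit = lookup(funcs, *far_guess)
    if near_hit and far_hit:
        return ("ambiguous", None)   # both guesses look plausible
    if near_hit:
        return ("near", near_hit)
    if far_hit:
        return ("far", far_hit)
    return ("unknown", None)
```

The "ambiguous" case is the false-positive risk: a stack word that happens to look like a valid segment makes both guesses resolve.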

So my question is this: How did debuggers of the DOS era deal with this problem and produce stack traces? Is there some algorithm for this I'm missing, or did they just encode debug information in the stack? (If this is the case, then I'll have to come up with something else.)

If you don't get an answer here, consider joining [reverseengineering.se] and asking on there. (But don't do both, unannounced. It'll only lead to fragmenting knowledge.) — Jongware, Oct 22 '18 at 22:33
`[SP]` isn't a valid 16-bit addressing mode, and only `[BP]` as a base implies SS as a segment, so yes, using BP for access to the stack was the only good option for random access (not just push/pop for temporaries). No reason not to save/restore it first to make a conventional legacy stack-frame. — Peter Cordes, Oct 23 '18 at 02:46
Our companion site [Retro Computing](https://retrocomputing.stackexchange.com/) *might* be a better fit for this question. I used 16-bit x86 environments extensively (including for low-level code) from 1985 on, but I have no recollections how debuggers worked at the time. — njuffa, Oct 23 '18 at 03:04

score 2 · Answer 1 · answered Oct 23 '18 at 02:55

Just a guess, I've never actually used 16-bit x86 development tools (modern or back in the day):

You know the CS:IP value of the current function (or, after a fault, the CS:IP saved in the exception frame).

You might have metadata that tells you whether this is a "far" function that's called with a far call or not. Or you could attempt decoding until you get to a retn or retf, and use that to decide whether the return address is a near IP or a far CS:IP.
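A rough Python sketch of the decode-until-ret idea follows. The opcode values are the standard 8086 ones (`C3`/`C2` for near `ret`, `CB`/`CA` for `retf`), but everything else is an assumption: `return_kind` and `toy_length` are made-up names, the scan is a plain linear sweep that ignores branches, and `toy_length` only knows a handful of opcodes. A real tool would plug in a full instruction-length decoder.

```python
NEAR_RET = {0xC3, 0xC2}   # ret, ret imm16
FAR_RET = {0xCB, 0xCA}    # retf, retf imm16

def return_kind(code, start, insn_length, limit=4096):
    """Step instruction by instruction from `start` until a ret is found.

    `insn_length(code, pos)` must return the length of the instruction at
    `pos`, or 0 if it cannot decode it. Returns "near", "far", or None.
    """
    pos = start
    end = min(len(code), start + limit)
    while pos < end:
        op = code[pos]
        if op in NEAR_RET:
            return "near"
        if op in FAR_RET:
            return "far"
        n = insn_length(code, pos)
        if n <= 0:
            return None   # unknown instruction: give up rather than guess
        pos += n
    return None

# Toy decoder for demonstration only: push bp, pop bp, nop, mov ax, imm16.
TOY_LENGTHS = {0x55: 1, 0x5D: 1, 0x90: 1, 0xB8: 3}

def toy_length(code, pos):
    op = code[pos]
    if op == 0x8B:
        # mov r16, r/m16: only the register-to-register form (mod == 3) is handled
        if pos + 1 < len(code) and code[pos + 1] >> 6 == 3:
            return 2
        return 0
    return TOY_LENGTHS.get(op, 0)

# push bp / mov bp, sp / mov ax, 0x00C3 / pop bp / retf
example = bytes([0x55, 0x8B, 0xEC, 0xB8, 0xC3, 0x00, 0x5D, 0xCB])
kind = return_kind(example, 0, toy_length)
```

Stepping by instruction length matters: the immediate operand of `mov ax, 0x00C3` contains the byte `C3`, which a naive byte scan would mistake for a near `ret`.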

(Assuming this is a normal function that returns with some kind of ret. If it instead ends with a jmp tailcall to another function, then the return address probably matches what that function expects, but that's another level of assumptions. And figuring out whether a near jmp is the end of a function rather than just a jump within a large function is an ambiguous problem without any symbol metadata.)

But anyway, apply the same thing to the parent function: after one level of successful backtracing, you now have the CS:IP of the instruction after the call in your parent function, and the SS:BP value of the BP linked list.
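Put together, the loop might look like this Python sketch. It is only an illustration under several assumptions: every function on the stack uses the `push bp` / `mov bp, sp` prologue and has already executed it, `read16(seg, off)` and `is_far(cs, ip)` are callbacks you supply (from the emulator's memory and from metadata or ret-decoding, respectively), and the walk stops when the saved BP no longer points further up the stack.

```python
def backtrace(read16, is_far, cs, ip, ss, bp, max_frames=64):
    """Walk the BP chain, using the kind of each function to size its return address.

    read16(seg, off) -> 16-bit word at seg:off
    is_far(cs, ip)   -> True / False for the function containing cs:ip,
                        or None if it cannot be determined
    Returns a list of (cs, ip) pairs, innermost first.
    """
    frames = [(cs, ip)]
    while len(frames) < max_frames:
        far = is_far(cs, ip)
        if far is None:
            break                                   # unknown function: stop here
        saved_bp = read16(ss, bp)                   # [bp]   = caller's BP
        ip = read16(ss, (bp + 2) & 0xFFFF)          # [bp+2] = return IP
        if far:
            cs = read16(ss, (bp + 4) & 0xFFFF)      # [bp+4] = return CS (far only)
        frames.append((cs, ip))
        if saved_bp <= bp:
            break                                   # chain no longer moves up the stack
        bp = saved_bp
    return frames
```

Each iteration only needs the near/far answer for the function it is currently in, so the guess at the return address never has to be made blind.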

And BTW, yes, there's a very good reason legacy BP stack frames were so widely used: `[SP]` isn't a valid 16-bit addressing mode, and only `[BP]` as a base implies SS as the segment, so BP was the only good option for random access to the stack (as opposed to push/pop for temporaries). There's no reason not to save/restore it first (before saving any other registers or reserving stack space), which gives you a conventional stack frame.
