
A little background first: stumbling upon this blog post, I learned it is possible to create DOS .COM files with the GNU linker, and it's not even rocket science. Using clang with the -m16 switch (which creates real-mode compatible 32-bit code by prefixing 32-bit instructions accordingly), this worked out quite well. So I had the idea to implement just enough runtime to get a little curses game I wrote recently to compile to a .COM and run in real-mode DOS. The game is small enough that squeezing everything (text, data, bss, heap, stack) into 64KB seemed doable. Of course, it uses malloc(), so I had to come up with my own implementation. This is what it looks like:

typedef unsigned short size_t; /* from stddef.h */

typedef struct hhdr hhdr;
struct hhdr
{
    void *next;
    int free;
};

extern char _heap;
static char *hbreak = &_heap;
static hhdr hhead = { &_heap, 0 };

static void *newchunk(size_t size)
{
    char *stack;
    __asm__("mov %%esp, %0": "=rm" (stack));
    if (hbreak + size > stack - 0x40) return 0;
    if (size < 1024) size = 1024;
    hhdr *chunk = (hhdr *)hbreak;
    hbreak += size;
    if (hbreak > stack - 0x40) hbreak = stack - 0x40;
    chunk->next = hbreak;
    chunk->free = 1;
    return chunk;
}

void *malloc(size_t size)
{
    if (!size) return 0;
    if (size % sizeof(hhdr)) size += sizeof(hhdr) - (size % sizeof(hhdr));

    hhdr *hdr = &hhead;
    while ((char *)hdr->next < hbreak)
    {
        hdr = hdr->next;
        if (hdr->free && 
                (char *)hdr->next - (char *)hdr - sizeof(hhdr) >= size)
        {
            if ((char *)hdr->next - (char *)hdr - 2*sizeof(hhdr) > size)
            {
                hhdr *hdr2 = (hhdr *)((char *)hdr + sizeof(hhdr) + size);
                hdr2->free = 1;
                hdr2->next = hdr->next;
                hdr->next = hdr2;
            }
            hdr->free = 0;
            return (char *)hdr + sizeof(hhdr);
        }
    }

    if (!(hdr->next = newchunk(size + sizeof(hhdr)))) return 0;
    return malloc(size);
}

void free(void *ptr)
{
    if (!ptr) return;
    hhdr *hdr = (hhdr *)((char *)ptr - sizeof(hhdr));
    hdr->free = 1;
    if ((void *)hdr != hhead.next)
    {
        hhdr *hdr2 = hhead.next;
        while (hdr2->next != hdr) hdr2 = hdr2->next;
        if (hdr2->free) hdr = hdr2;
    }
    hhdr *next = hdr->next;
    while ((char *)next < hbreak)
    {
        if (!next->free) break;
        hdr->next = next;
        next = next->next;
    }
    if ((char *)next == hbreak) hbreak = (char *)hdr;
}

The _heap symbol is defined by the linker. I'm not showing realloc() here, as it isn't used right now anyway (and is therefore completely untested).

The problem now is: I created my runtime here (malloc is in src/libdos/stdlib.c), wrote a lot of test code, and in the end everything seemed to work quite well. My game, on the other hand, is thoroughly tested and checked for invalid memory accesses using valgrind. Still, when I put both parts together, it just crashes. (Try building the game from git with make -f libdos.mk; you will need llvm/clang installed.)

As I experienced a strange heisenbug first (I worked around it for now), I guess it COULD be the optimizer's fault, getting things wrong when compiling for real mode, which is indeed uncommon. But I can't be sure, and the next sensible candidate would probably be my memory management, see above.

Now the tricky question: How would I debug such a thing? With just my own test code, it works very well. I can't compile my game without optimizations because it will exceed 64KB when doing so. Any suggestions? Or can anyone spot something obviously wrong with the above code?

  • Minor: Recommend `(hbreak + size > stack - 0x40)` --> `(hbreak + size +0x40 > stack)` to avoid potential underflow. Also at `(hbreak > stack - 0x40)`. – chux - Reinstate Monica Aug 05 '15 at 20:38
  • As the stack starts at `0xffff` and the code at `0x0100` for `.COM`, this would be a very b0rked situation ;) but nevertheless, might be a good idea in general! –  Aug 05 '15 at 20:41
  • Could the problem be the reverse? If the stack is at `0xFFFF`, might some computations be overflowing (wrapping around), given a 16-bit `unsigned`? If so, use 32-bit unsigned as in `1UL*hbreak + size +0x40 > stack` - just an idea. – chux - Reinstate Monica Aug 05 '15 at 20:47
  • Hmmm ... it doesn't make sense to have a `size_t` bigger than 16 bits because addresses greater than `0xffff` are invalid in *real mode* (you'd have to use segment registers, of course `clang` doesn't support this). But still, pointers *are* 32bit, so the expression should be calculated in 32bit anyways. With some constructed test code, my `malloc()` failed as soon as reaching the stack area, as expected ... –  Aug 05 '15 at 21:00
  • 1) It isn't that `size_t` is 16 bits; my concern is that the math of `hbreak + size +0x40` has a 17-bit answer. 2) Rather than using the heap pointer, create a global `mem[N]` where N is about as large as reasonable/possible while leaving enough stack for the code. Use `mem` for your memory pool. Effectively this is about the same as using the heap but with fewer `asm` tricks. 3) In real mode and .COM, I thought all pointers are 16-bit (same code, data, stack segment). – chux - Reinstate Monica Aug 05 '15 at 21:10
  • @chux thanks for your ideas. To 1) As far as I understand, this shouldn't matter, because I use the whole (32bit) `esp` register to compare with. Could there be "junk" in the upper 16 bits? :o 2) I didn't initially try a statically allocated memory pool because I wanted to stay as generic as possible (so, not pre-determining the size of the heap) ... but I'll give it a try and see whether it behaves differently. –  Aug 05 '15 at 21:22
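
The two suggestions in the comments above (do the limit arithmetic in a wider type, and use a static `mem[N]` pool) can be sketched together. This is only an illustration under assumptions, not the code from the question: `POOL_SIZE`, `pool_break`, `fits_16bit`, `fits_wide` and `pool_newchunk` are hypothetical names, and the pool size is an arbitrary choice.

```c
#include <stddef.h>
#include <stdint.h>

#define POOL_SIZE 0x4000u   /* assumed size; must leave room for code and stack */

static unsigned char mem[POOL_SIZE];  /* static pool instead of _heap .. esp */
static uint16_t pool_break = 0;       /* offset of the first unused byte */

/* Limit check done purely in 16 bits: the sum can wrap past 0xFFFF. */
static int fits_16bit(uint16_t brk, uint16_t size, uint16_t margin, uint16_t limit)
{
    uint16_t end = (uint16_t)(brk + size + margin);
    return end <= limit;
}

/* Same check widened to 32 bits: a 17-bit sum is represented exactly. */
static int fits_wide(uint16_t brk, uint16_t size, uint16_t margin, uint16_t limit)
{
    return (uint32_t)brk + size + margin <= (uint32_t)limit;
}

/* Hand out `size` raw bytes from the pool, or NULL if they do not fit. */
static void *pool_newchunk(uint16_t size)
{
    if (!fits_wide(pool_break, size, 0, POOL_SIZE)) return NULL;
    void *chunk = &mem[pool_break];
    pool_break = (uint16_t)(pool_break + size);
    return chunk;
}
```

With `0xFF00 + 0x0200 + 0x40` the 16-bit version wraps to `0x0140` and wrongly reports that the request fits, while the widened version rejects it. Whether the question's code ever computes in 16 bits depends on how clang treats the pointers involved, so this shows the failure mode rather than diagnosing the crash.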

1 Answer


If this is real-mode DOS, I'm not sure about the upper bits of esp. As for malloc(), use the memory between ss:sp and 0xa000:0000, that is, the memory between the top of the stack and the 640k boundary. I don't recall whether MS-DOS allocates all of the 640k region for a .COM program or not. There are two relevant DOS calls: INT 21H with ah = 04AH resizes (and so can release) a memory block, and ah = 048H allocates memory, but I don't recall whether these are for .COM or .EXE programs.
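
If the upper 16 bits of esp really can hold junk in real mode, one defensive option is to discard them before comparing against the heap break. The following is a minimal sketch of that idea, assuming the 32-bit value has already been read the way the question's inline asm does; `sp_from_esp` and `heap_would_hit_stack` are hypothetical helper names, and `0x40` is the safety margin from the question.

```c
#include <stdint.h>

/* Keep only the low 16 bits (the SP part) of a 32-bit ESP value. */
static uint16_t sp_from_esp(uint32_t esp)
{
    return (uint16_t)(esp & 0xFFFFu);
}

/* Limit test in the spirit of newchunk(), on 16-bit offsets within the
   segment; the addition is widened to 32 bits so it cannot wrap. */
static int heap_would_hit_stack(uint16_t brk, uint16_t size, uint32_t esp)
{
    uint32_t sp = sp_from_esp(esp);
    return (uint32_t)brk + size + 0x40u > sp;
}
```

Without the mask, a value such as `0xABCDFFFE` would compare as far above any 16-bit heap address, so the "heap reached the stack" test could never fire.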

Pawan
rcgldr
  • Wait, for `.COM`, `cs` == `ds` == `es` and using `clang`, I have no way to access memory in any other segment (except through inline assembly of course), so I'm using *just* 64KB here. Why allocate memory from DOS that you can't use in C anyways? The heap starts wherever sections placed by the linker end (right now around `[ds]:8300`) and the stack starts (in the same segment) from `[ds]:ffff`. –  Aug 06 '15 at 05:43
  • With a Microsoft compiler, you can use `__far` to access memory outside the current data segment. For example: `char __far *fpchar;` – rcgldr Aug 06 '15 at 07:08