To bring back some memories, I sat down to code a little assembly game in VGA mode 13h, only to find that the visual output flickers like hell.

At first I suspected my clearscreen routine. Indeed, using STOSW instead of writing a single byte to video memory at a time makes the flickering less annoying, but it is still present.

Digging a bit further, I recalled that I might have to wait for the vertical retrace and update my screen right after it, but that didn't make things much better.

So the final solution I'm aware of goes a little like this:

  • do all graphical operations - clearing the screen, setting pixels - on a separate memory region
  • wait for the vertical retrace
  • copy the memory over to the video memory

The theory is of course simple but I just can't figure out how to do my writes to the buffer and ultimately blit it into the video memory!

Here's a stripped-down (though working) snippet of my code, written for TASM:

VGA256      EQU 13h
TEXTMODE    EQU 3h
VIDEOMEMORY EQU 0a000h
RETRACE     EQU 3dah
.MODEL LARGE

.STACK 100h

.DATA 
spriteColor     DW ?
spriteOffset    DW ?
spriteWidth     DW ?
spriteHeight    DW ?
enemyOneA       DB 0,0,0,0,0,0,0,1,1,0,0,0,0,0,0,0,0,0,0,0,0,0,1,1,1,1,0,0,0,0,0,0,0,0,0,0,0,1,1,1,1,1,1,0,0,0,0,0,0,0,0,0,1,1,0,1,1,0,1,1,0,0,0,0,0,0,0,0,1,1,1,1,1,1,1,1,0,0,0,0,0,0,0,0,0,0,1,0,0,1,0,0,0,0,0,0,0,0,0,0,0,1,0,1,1,0,1,0,0,0,0,0,0,0,0,0,1,0,1,0,0,1,0,1,0,0,0,0
spriteToDraw    DW ?
buffer          DB 64000 dup (0) ; HERE'S MY BUFFER

.CODE
Main:
    MOV     AX,@DATA;
    MOV     DS,AX

    MOV     AH,0
    MOV     AL,VGA256
    INT     10h
    CLI
MainLoop:
    MOV     DX,RETRACE
Vsync1:
    IN      AL,DX
    TEST    AL,8
    JZ      Vsync1
Vsync2:
    IN      AL,DX
    TEST    AL,8
    JNZ     Vsync2
        
    CALL    clearScreen
    CALL    updateSprites
    JMP     MainLoop
    mov     AH,1
    int     21h

    mov     AH,0
    mov     AL,TEXTMODE
    int     10h 

; program end

clearScreen PROC NEAR 
    MOV     BX,VIDEOMEMORY
    MOV     ES,BX
    XOR     DI,DI
    MOV     CX,320*200/2
    MOV     AL,12
    MOV     AH,AL
    REP     STOSW
    RET
clearScreen ENDP

drawSprite PROC NEAR
    MOV     DI,0
    MOV     CX,0
ForLoopA:
    PUSH    CX
    MOV     SI,CX
    MOV     CX,0
ForLoopB:
    MOV     BX,spriteToDraw
    MOV     AL,[BX+DI]

    CMP     AL,0
    JE      DontDraw

    MOV     BX,spriteColor
    MUL     BX

    PUSH    SI
    PUSH    DI
    PUSH    AX

    MOV     AX,SI
    MOV     BX,320
    MUL     BX
    MOV     BX,AX
    
    POP     AX
    POP     DI

    ADD     BX,CX
    ADD     BX,spriteOffset
    MOV     SI,BX

    MOV     BX,VIDEOMEMORY
    MOV     ES,BX
    MOV     ES:[SI],AL
    POP     SI
DontDraw:
    INC     DI
    INC     CX
       
    CMP     CX,spriteWidth
    JNE     ForLoopB
    POP     CX
    INC     CX
    CMP     CX,spriteHeight
    JNE     ForLoopA
    RET
drawSprite ENDP

updateSprites PROC NEAR
    MOV     spriteOffset,0
    MOV     spriteColor,15
    MOV     spriteWidth,16
    MOV     spriteHeight,8     
    MOV     spriteOffset,0
    MOV     spriteToDraw, OFFSET enemyOneA
    CALL    drawSprite
    RET
updateSprites ENDP

END Main
fuz
obscure
  • So... what is your question? – fuz Jan 15 '21 at 12:49
  • Well, how to write the data to the memory region 'buffer' in my snippet and blit that memory over to the video memory after the vertical retrace. – obscure Jan 15 '21 at 12:52
  • `rep movsb` or `w` should be very fast on a modern CPU even when the destination is video RAM (mapped WC), easily fast enough to run during a vblank. (And probably finish before the first line scans, although the actual requirement is just to be fast enough that scan-out doesn't catch up with your copying.) – Peter Cordes Jan 15 '21 at 13:47
  • There's also using the VGA registers to change the display page and the page mapped into the physical memory, IIRC. But I don't see my reference books for that, which probably means they're stuck in a box someplace... – 1201ProgramAlarm Jan 15 '21 at 18:20

1 Answer

The first problem is that you're in real mode, which means you're working with 64 KiB segments. For 320*200 with 256 colors the buffer needs 64000 bytes, so if you try to have a single data segment containing everything you'll only have 1536 bytes (65536 - 64000) left for things that aren't the buffer (sprites, global variables, etc). That's too restrictive (sooner or later you're going to want animated sprites, or a level/map/background scenery, or ...).

The next problem is that you don't want 64000 bytes of zeroes in the executable file. Normally you'd use a ".bss section" to avoid that (a special area for "assumed initialized to zero" or "assumed uninitialized" data that isn't stored in the executable file).

To solve both of these problems, I'd allocate memory for the buffer (e.g. using the int 0x21, ah = 0x48 DOS function) and keep its segment in a variable. In this case blitting the buffer to video memory might look like:

    push es
    push ds
    mov ax,VIDEO_MEMORY_SEGMENT
    mov bx,[bufferSegment]
    mov es,ax
    mov ds,bx
    mov cx,320*200/2
    cld
    xor si,si               ;ds:si = bufferSegment:0 = address of buffer
    xor di,di               ;es:di = VIDEO_MEMORY_SEGMENT:0 = address of video memory
    rep movsw
    pop ds
    pop es
    ret
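
The allocation step itself could be sketched as follows. This is only a sketch under a few assumptions: the program is a TASM .EXE, ES still holds the PSP segment from program start, the stack segment is the last segment in the image, bufferSegment is declared as `bufferSegment DW ?` in .DATA, and allocFailed is a hypothetical error label. DOS normally gives an .EXE all available memory, so the program's own block usually has to be shrunk (int 0x21, ah = 0x4A) before ah = 0x48 can succeed.

```asm
    ; Run once at program start, after DS = @DATA but before ES is changed.
    mov bx,ss
    mov ax,es               ; ES = PSP segment at program start
    sub bx,ax               ; paragraphs from the PSP to the stack segment
    mov ax,sp
    add ax,15
    mov cl,4
    shr ax,cl               ; stack size, rounded up to whole paragraphs
    add bx,ax               ; bx = paragraphs the program itself needs
    mov ah,4Ah              ; DOS: resize the memory block at ES to BX paragraphs
    int 21h
    jc allocFailed          ; hypothetical error handler

    mov bx,64000/16         ; 4000 paragraphs = 64000 bytes
    mov ah,48h              ; DOS: allocate memory, segment returned in AX
    int 21h
    jc allocFailed
    mov [bufferSegment],ax  ; remember the buffer's segment for later use
```

The allocated block is not zeroed by DOS, which doesn't matter here because the buffer is cleared at the start of every frame.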

Note 1: It'd be better/faster to use mov cx,320*200/4 and rep movsd to copy 4 bytes at a time, but this requires a 32-bit CPU (it won't work on an 80286 or earlier). If the CPU supports them, 32-bit instructions work fine in 16-bit code (it's just an operand-size prefix that overrides the default size, and you do not need to switch to protected mode).
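
As a rough sketch of what that change might look like in TASM: the assembler has to be told to accept 386 instructions with a .386 directive. The placement after .MODEL is an assumption on my part, chosen so that the segments keep their 16-bit default; check it against your TASM version.

```asm
; Near the top of the source file:
.MODEL LARGE
.386                    ; assumed placement: after .MODEL, so segments stay 16-bit

; In the blit routine, replacing the "mov cx,320*200/2" / "rep movsw" pair:
    mov cx,320*200/4    ; 16000 doublewords = 64000 bytes
    rep movsd           ; copy 4 bytes per iteration from ds:si to es:di
```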

Note 2: The cld (clear the "direction flag") may be unnecessary. Typically you clear the direction flag once at the start of your program (or rely on the flag being "guaranteed clear by OS at program start") so that you don't need to make sure it's clear every time you use a string instruction (e.g. rep movsw).

For writing to the buffer, all your code would remain the same except that you'd load es with the buffer's segment (the value stored in bufferSegment) instead of the video memory segment (VIDEOMEMORY in your code).
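
Applied to the code from the question, that might look roughly like this. It's a sketch, assuming bufferSegment is a word variable in .DATA that already holds the buffer's segment, and that blitBuffer is a hypothetical name for the blit routine shown earlier. The same one-line change to es would apply inside drawSprite.

```asm
clearScreen PROC NEAR
    MOV     ES,bufferSegment    ; was: MOV BX,VIDEOMEMORY / MOV ES,BX
    XOR     DI,DI
    MOV     CX,320*200/2
    MOV     AL,12
    MOV     AH,AL
    CLD
    REP     STOSW               ; fill the off-screen buffer, not video memory
    RET
clearScreen ENDP

MainLoop:
    CALL    clearScreen         ; both routines now draw into the buffer
    CALL    updateSprites
    MOV     DX,RETRACE
WaitDisplay:
    IN      AL,DX
    TEST    AL,8
    JNZ     WaitDisplay         ; still inside a retrace, wait for it to end
WaitRetrace:
    IN      AL,DX
    TEST    AL,8
    JZ      WaitRetrace         ; wait for the next retrace to begin
    CALL    blitBuffer          ; hypothetical name for the blit routine
    JMP     MainLoop
```

Doing all the drawing first and waiting for the retrace only just before the copy means the only writes to video memory happen at the start of the retrace period.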

Note 3: Rather than loading es with the same value in multiple places (in clearScreen, in the middle of a loop in drawSprite(!), etc) it'd be better to set es once during program initialization and save/restore it when you need to use es for something else (in the blitting function); so that you can avoid the (relatively expensive) segment register loads (e.g. mov es,bx) in all of the drawing code.

Also, if you do end up wanting a background image (generated from level/map data, or...) you could use a third "background buffer". This would be mostly the same: allocate another 64000 bytes for the background (and have a background_segment), then draw the background into that background buffer once (when you load the level or generate the map or ...). Each frame, copy the "already drawn" background data from the background buffer to the main buffer instead of clearing the main buffer, draw your sprites on top of it, and then blit the main buffer to video.

Brendan
  • Recent CPUs (IvyBridge and later) have fast `rep movsb`. And even `rep movsw` is not much if any slower on most Intel/AMD CPUs from this century. So using `movsd` is probably only a big deal if you care about older CPUs (like probably before P6-family), which might be the case if you're writing retro 16-bit code. – Peter Cordes Jan 15 '21 at 15:18
  • Thanks for taking the time @Brendan. I tried what you've suggested for simplicity though I kept the buffer inside the **.DATA** section (just for testing) and initialized it like `DW 32000 dup (0)`. Afterwards I tried to modify the clearScreen function `MOV BX, OFFSET buffer` and `MOV ES, BX` but this change makes the program freeze at the `REP STOSW` instruction. What could be the cause? – obscure Jan 16 '21 at 13:49
  • @obscure: You can't (easily) use an "offset within a segment" as a segment. If `buffer` is at offset 0x1234 within your data segment, then loading 0x1234 into `es` and doing the `rep stosw` will trash whatever happens to be in RAM at that address, which has nothing to do with your code (possibly overwriting DOS's code or data, possibly overwriting your own code, ...). – Brendan Jan 16 '21 at 23:55
  • @obscure: More specifically, to use `buffer` as a segment you'd have to do a `segment = ds + (buffer >> 4)` calculation (which will only work if `buffer` is aligned on a 16-byte boundary, and if the addition doesn't cause an overflow). – Brendan Jan 16 '21 at 23:57
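
The calculation from the last comment could be sketched like this. It assumes DS already points at the data segment and that the offset of `buffer` is a multiple of 16, which is not guaranteed by the layout in the question (an alignment directive before `buffer` would be needed).

```asm
    MOV     AX,OFFSET buffer
    MOV     CL,4
    SHR     AX,CL               ; offset / 16 = offset in paragraphs
    MOV     BX,DS
    ADD     AX,BX               ; segment = DS + (offset >> 4)
    MOV     ES,AX               ; ES:0 now addresses the first byte of buffer
```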