I'm trying to get the second character of a string (e.g. the e in Test). I'm using emu8086 to assemble and run the code.

When I do:

str db 'Test$'
...
mov si, 1       ; get second character of str
mov bl, str[si] ; store the second character

mov ah, 2       ; display the stored character
mov dl, bl
int 21h

The output is e.

But when I do:

str db 25
    db ?
    db 25 dup (?)
...
mov ah, 0ah         ; accept a string
lea dx, str         ; store input in variable str
int 21h

mov si, 1           ; get second character of str (??)
mov bl, str[si]     ; store the second character

mov ah, 2           ; display the stored character
mov dl, bl
int 21h

I get .

When I change the second snippet's "get second character of str" portion to this:

mov si, 3               ; get second character of str (why is it '3' instead of '1'?)
mov bl, str[si]         ; store the second character

I get e.

I don't understand. It works with SI = 1 in the first snippet, so why, in the second snippet, do I have to set SI to 3 instead of 1 to reference the second character of the string? Or is my whole approach misguided?

lefrost
  • What do lines `str db 25`, `db ?`, `db 25 dup (?)` do? –  Sep 03 '18 at 09:53
  • @Ivan As far as I understand it, those lines set up the buffer that service `0ah` needs in order to accept a string and store it in `str`. `str db 25` defines the maximum number of characters that may be entered (here 25), `db ?` receives the number of characters the user actually entered (left undefined), and `db 25 dup (?)` holds the characters accepted as input (25 undefined bytes). https://stackoverflow.com/a/29517960/8919391 – lefrost Sep 03 '18 at 10:03
  • I think you should check http://spike.scu.edu.au/~barry/interrupts.html#ah0a for detailed information. You might find your answer there. –  Sep 03 '18 at 10:06
  • The second byte of that structure is also an "input" argument on some DOS versions: it tells DOS how many bytes in the raw buffer are already valid for "recall input". So instead of `db ?` it is safer to use `db 0`, to let DOS know that the current buffer is completely undefined reserved memory. It's very common for examples to use `db ?` for the second value in that 0A structure, though, so don't worry about it; it will assemble as `db 0` anyway. This is a JFYI comment, also meant to give you an idea of how subtle details can affect the behaviour of asm code, and how precise you must be... – Ped7g Sep 03 '18 at 10:34

1 Answer

str[si] is not a typed array access. It assembles into a plain memory operand such as [si+1234], where 1234 is the offset that the label str points to in memory, so the CPU simply reads the byte at address str + si.
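To make the translation concrete, here is a minimal sketch in emu8086/MASM-style syntax. It assumes DS already points at the segment that holds str; the explicit-address form is added only for illustration and is not part of the original code.

```asm
mov si, 1              ; index 1 = one byte past the label
mov bl, str[si]        ; reads the byte at (offset of str) + SI

mov bx, offset str     ; the same access with the address computed explicitly:
mov al, [bx + si]      ; BX = address of str, SI = index, AL = byte at str+1
```

Both loads read the same byte. The assembler does no element counting or bounds checking; it only adds the label's offset to the register.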

In your second example the label str points at the byte holding 25 (the maximum buffer length), str+1 points at the input-length byte that DOS fills in on return (that is the value you see when you print it as a character), and str+2 points at the first character of the user's input. So to get the second character you must read from address str+3.
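The layout can be annotated byte by byte. This is a sketch of the question's own definition, with comments describing what each offset holds after `int 21h` / `ah = 0ah` returns; the "Test" input is just an example.

```asm
str db 25          ; str+0: maximum number of characters DOS may store (set by you)
    db ?           ; str+1: number of characters typed, filled in by DOS (Enter not counted)
    db 25 dup (?)  ; str+2: first typed character, str+3: second, and so on;
                   ;        the input is followed by a carriage return (0Dh)

; After typing "Test" and pressing Enter, memory at str holds:
;   25, 4, 'T', 'e', 's', 't', 0Dh, ...
; so str[1] is the length byte (4) and str[3] is 'e'.
```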

The memory is addressable by bytes, so you either have to be aware of byte-size of all elements, or use more labels, like:

str_int_0a:            ; label to the beginning of structure for "0a" DOS service
     db 25
     db ?
str:                   ; label to the beginning of raw input buffer (first char)
     db 25 dup (?)

Then in the code you use the correct label depending on what you want to do:

...
mov ah, 0ah         ; accept a string
lea dx, str_int_0a  ; DS:DX -> buffer structure; typed characters land at str
int 21h

mov si, 1           ; index of second character of str
mov bl, str[si]     ; load the second character

mov ah, 2           ; display the stored character
mov dl, bl
int 21h
...
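Because the length byte sits right before the characters, it can also be used to check that a second character exists before printing it. A minimal sketch, assuming the str_int_0a and str labels from the snippet above; the label no_second_char is hypothetical and introduced only for this example.

```asm
mov bx, offset str_int_0a
mov al, [bx + 1]        ; AL = number of characters DOS stored
cmp al, 2
jb  no_second_char      ; fewer than two characters typed: nothing to print

mov si, 1               ; index of second character of str
mov dl, str[si]         ; load it straight into DL
mov ah, 2               ; DOS service: display character in DL
int 21h

no_second_char:
```

Without this check, a one-character input would make the code print the carriage return that DOS stores after the text.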

Use a debugger to watch the values in memory, the registers, and the assembled instructions. That gives you a much better feel for how these things work inside the CPU, how segment:offset addressing is used to access memory in 16-bit real mode on x86, and so on.

Ped7g