Your code is broken because you always fall through to save: MOV CH, AL
every iteration, so it can only work if the last capital is also the very last character of the whole input.
Single-step it with a debugger for a simple input like ABc*
to see how it goes wrong.
Also, you use loop
, which is like dec cx/jnz
. That makes no sense because there's no counter-based termination condition, and could potentially corrupt CH if CL was zero. You don't even initialize CX first! The loop
instruction is not the only way to loop; it's just a code-size peephole optimization you can use when it's convenient to use CX as a loop counter. Otherwise don't use it.
This is a simplified version of Sep's implementation, taking advantage of the fact that the input is guaranteed to be alphabetic, so we really can check for upper case as easily as c <= 'Z'
(after ruling out the '*'
terminator). We don't have to worry about inputs like 12ABcd7_
or spaces or newlines, which also have lower ASCII codes than the upper-case alphabetic range. Your cmp al,'Z'
/ ja
check was correct, it's just the code you were branching to that didn't have sane logic.
Even if you did want to strictly check c >= 'A' && c <= 'Z'
, that range check can be done with one branch using sub al,'A'
; cmp al,'Z'-'A'
; ja non_upper
instead of a pair of cmp/jcc branches. (That modifies the original, but if you save it in SI or something you could later restore it with lea ax, [si+'A']
)
You can also put a conditional branch at the bottom of the loop for both loops, instead of a jmp
at the bottom and an if() break
inside. Sep's code already did that for the first loop.
I agree with Sep that having 2 loops is easier than checking a flag every time you find a capital (to see if it's the first capital or not).
ORG 100h ; DOS .com is loaded with IP=100h, with CS=DS=ES=SS
; we don't actually do any absolute addressing so no real effect.
mov ah, 01h ; DOS.GetKeyboardCharacter
; AH=01 / int 21h doesn't modify AH so we only need this once
find_first_cap:
int 21h ; stdin -> AL
cmp al, '*' ; Found end of input marker ?
je Done ; if (c=='*') return; without print anything, we haven't found a capital yet
cmp al, 'Z'
ja find_first_cap
; fall through: AL <= 'Z' and we can assume it's a capital letter, not a digit or something.
mov dl, al ; For now it's the first
;mov dh, al ; AND the last capital
;mov ah, 01h ; DOS.GetKeyboardCharacter AH still = 01
;jmp loop2_entry ; we can let the first iteration set DH
Loop2: ; do {
cmp al, 'Z' ; assume all c <= 'Z' is a capital alphabetic character
ja loop2_entry
mov dh, al ; This is the latest capital
loop2_entry:
int 21h ; stdin -> AL
cmp al, '*'
jne Loop2 ; }while(c != '*');
Show: mov ah, 02h ; DOS.DisplayCharacter
int 21h ; AL -> stdout
mov dl, dh
; mov ah, 02h ; DOS.DisplayCharacter
int 21h ; AL -> stdout
Done: mov ax, 4C00h ; DOS.TerminateWithReturnCode
int 21h
At this point it's arguably not simpler, but is more optimized especially for code-size. That tends to happen when I write anything because that's the fun part. :P
Having a taken branch inside the loop for the non-capital case is arguably worse for performance. (In modern code for a P6-compatible CPU you'd probably use cmovbe esi, eax
instead of a conditional branch, because a conditional move is exactly what you want.)
Omitting the mov ah, XX
before an int 21h
because it's still set doesn't make your program more human-readable, but it is safe if you're careful to check the docs for each call to make sure they don't return anything in AH.