Zero out characters with even number of set bits, and reverse string

Question

I am not able to finish the given task using DOS Debug:

Every symbol of the input string that has an even number of set bits has to be changed to 0. Then the string has to be reversed and printed to the screen.

a200
db 50

a260
db 'Enter string' 0d 0a '$'

a100
mov ah, 09
mov dx, 260
int 21
mov ah, 0a
mov dx, 200
int 21
mov ah, 02
mov dl, 0d
int 21
mov ah, 02
mov dl, 0a
int 21
xor cx, cx
mov bx, 201
mov cl, [bx]
int bx
mov dl, [bx]
inc bx
mov dl, [bx]
mov al, dl
mov ah, 0
clc
rcr al, 1
adc ah, 0

This is how far I was able to get. However, it is not finished, and I am not sure if I am going in the right direction.

I have an idea to use the parity flag to check if the number of set bits is even. However, I can't implement it.

*string symbol, that has even number of bits*? If *symbol* is a byte, then each symbol has exactly eight bits (even). But yes, parity flag represents the number of bits in a byte which are set to 1. `TEST AL,AL` `JPO somewhere`. — vitsoft, Oct 12 '22 at 16:50
`int bx` won't assemble. Typo for `inc bx` I assume. (You could have used `[bx+1]` instead of multiple increments). You could get the reversal part working separately from the conditional zeroing. — Peter Cordes, Oct 12 '22 at 22:56
If you could use AVX-512BITALG (Ice Lake), the fun way to do this would be `vpopcntb ymm1, ymm0` (https://www.felixcloutier.com/x86/vpopcnt) / `vptestmb k1, ymm1, set1_epi8(1)` (https://www.felixcloutier.com/x86/vptestmb:vptestmw:vptestmd:vptestmq) to get a mask of elements with odd parity (to *not* be zeroed). (And `vpermb` can reverse in chunks of 16, 32, or 64 bytes). Curious if there are other more direct ways to get the parity, like possibly with [`gf2p8affineqb`](https://www.felixcloutier.com/x86/gf2p8affineqb)? Its pseudo-code involves a parity computation. — Peter Cordes, Oct 12 '22 at 23:00
DOS Debug.exe of course won't know about AVX-512 or GFNI instructions, although the non-AVX form of `GF2P8AFFINEQB` could be usable in 16-bit real mode. (Unlike VEX and EVEX prefixes.) — Peter Cordes, Oct 12 '22 at 23:05

score 3 · Accepted Answer · edited Nov 07 '22 at 06:10

int bx
mov dl, [bx]
inc bx
mov dl, [bx]
mov al, dl
mov ah, 0
clc
rcr al, 1
adc ah, 0

Up to reading the length of the user-entered string, your code looks fine, but then it starts to look like you've just thrown some random stuff together!
Your idea to use the parity flag is ok. When a byte has 0, 2, 4, 6, or 8 bits that are set (to 1), the PF will be set. When a byte has 1, 3, 5, or 7 bits that are set (to 1), the PF will be clear.
The x86 instruction set has 4 mnemonics (2 opcodes, each with 2 names) that allow you to conditionally jump based on the state of the parity flag:
The jp and jpe instructions share the same opcode 7Ah.

jp jump if parity (PF=1)
jpe jump if parity even (PF=1)

The jnp and jpo instructions share the same opcode 7Bh.

jnp jump if no parity (PF=0)
jpo jump if parity odd (PF=0)
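To make the PF rule above concrete, here is a small Python model of it. This is only a sketch for checking values by hand, not DOS code; the function name `parity_flag` is made up for this illustration.

```python
def parity_flag(byte: int) -> int:
    """Model of the x86 PF for an 8-bit result: 1 if the count of set bits is even."""
    return 1 if bin(byte & 0xFF).count("1") % 2 == 0 else 0

# Show a few characters with their bit patterns and the PF they would produce.
for ch in ("A", "C", "a", "0"):
    print(ch, format(ord(ch), "08b"), "PF =", parity_flag(ord(ch)))
```

A character with PF = 1 is one that `jp`/`jpe` would branch on; one with PF = 0 is one that `jnp`/`jpo` would branch on.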

There are many instructions that modify the parity flag. The code below uses the cmp instruction: comparing the byte with 0 leaves the byte unchanged and sets PF according to its bits. In your program you want to zero in case of parity, which is equivalent to skipping the zeroing in case of no parity. That's why the code uses the jnp instruction.

      ...
011F  mov  cl, [bx]

0121  mov  bx, 202            ; Where the string starts
0124  cmp  byte ptr [bx], 0   ; Have the parity flag defined
0127  jnp  012C               ; Skip in case of no parity
0129  mov  byte ptr [bx], 0   ; Zero in case of parity
012C  inc  bx                 ; Go to next character
012D  loop 0124               ; Until all characters have been processed

At the end of the above loop, the BX register points right behind the string. This is a good moment to apply the $-terminator that you will need in order to print your final result.

012F  mov  byte ptr [bx], "$"
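The zeroing loop above can be cross-checked with a short Python model. It is a sketch under the assumption that the string is an ordinary byte buffer; `zero_even_parity` and its `fill` parameter are names invented here. The default `fill` of 0x00 matches the listing above, while 0x30 (ASCII '0') matches the later listings.

```python
def zero_even_parity(buf: bytearray, fill: int = 0x00) -> None:
    """Replace, in place, every byte that has an even number of set bits by `fill`."""
    for i in range(len(buf)):                   # i plays the role of BX walking the string
        if bin(buf[i]).count("1") % 2 == 0:     # even parity: PF=1 after cmp byte ptr [bx], 0
            buf[i] = fill                       # the mov byte ptr [bx], ... store
        # odd parity (PF=0): jnp skips the store and the byte is kept

text = bytearray(b"ABC")
zero_even_parity(text, fill=0x30)   # 'A' and 'B' have 2 set bits each, 'C' has 3
print(text)
```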

The task of reversing the string requires maintaining two pointers that move towards each other while swapping bytes:

0132  dec  bx          ; Have BX point at the last byte in the string
0133  mov  si, 202     ; Have SI point at the first byte in the string
0136  mov  al, [bx]    ; Read both bytes
0138  mov  dl, [si]
013A  mov  [bx], dl    ; Swap both bytes
013C  mov  [si], al
013E  inc  si          ; Move towards the middle of the string
013F  dec  bx
0140  cmp  si, bx      ; Repeat while SI is still below BX
0142  jb   0136
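The two-pointer swap above can be modeled in Python as follows. This is a sketch, not the DEBUG code itself: `reverse_in_place` is a hypothetical name, indices stand in for the SI and BX pointers, and the empty-buffer guard is an addition that the listing does not have.

```python
def reverse_in_place(buf: bytearray) -> None:
    """Reverse buf using two indices that move towards each other."""
    if not buf:
        return                      # guard added for the model; the listing has none
    si, bx = 0, len(buf) - 1        # SI at the first byte, BX at the last byte
    while True:                     # like the listing: swap first, test afterwards
        buf[si], buf[bx] = buf[bx], buf[si]
        si += 1
        bx -= 1
        if not si < bx:             # jb repeats only while SI is below BX
            break

sample = bytearray(b"abcde")
reverse_in_place(sample)
print(sample)
```

Because the test comes after the swap, a one-byte string is swapped with itself once, which is harmless.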

EDIT 1

This edit deals with the OP's effort to implement the suggested solution.
This is the OP's code that reportedly runs into an infinite loop:

a200
db 50

a300
db 'Enter string' 0d 0a '$'

a100
mov ah, 09
mov dx, 300
int 21
mov ah, 0a
mov dx, 200
int 21
mov ah, 02
mov dl, 0d
int 21
mov ah, 02
mov dl, 0a
int 21
mov bx, 202

a200
cmp byte ptr [bx], 0
jnp 250 
mov byte ptr [bx], 30

a250
inc bx
loop 200
mov byte ptr [bx], '$'
dec bx
mov si, 202

a400
mov al, [bx]
mov dl, [si]
mov [bx], dl
mov [si], al
inc si
dec bx
cmp si, bx
jb 400
mov dx, 202
mov ah, 09
int 21
mov ah, 4c
int 21

n lab1.com
r cx
800
w
q

The reasons why this fails are:

Right after outputting the carriage return and linefeed, your original code had instructions to load CX with the length of the string. You have deleted these lines, so CX is never initialized and the loop could now be running a very long time.
With a200 cmp byte ptr [bx], 0 you are overwriting the input buffer that you had set up with a200 db 50. Keep data and code apart.
All the instructions in your program must stay close together. Each time you give another 'assemble' command like a200, a250, and a400 you are leaving holes in the program. The CPU will try to execute the bytes that happen to exist in these holes, but since they are not instructions there's a high chance this will fail miserably. Look closely at the code I wrote in my answer. Those numbers (011F, 012F, 0132, ...) in the leftmost column are the addresses where the code belongs so the code forms one contiguous block.

The whole code with corrections is:

a200
db 50 00

a300
db 'Enter string' 0D 0A '$'

a100
mov  ah, 09                    ; This is at address 0100
mov  dx, 300
int  21
mov  ah, 0A
mov  dx, 200
int  21
mov  ah, 02
mov  dl, 0D
int  21
mov  ah, 02
mov  dl, 0A
int  21
xor  cx, cx
mov  bx, 201
mov  cl, [bx]
mov  bx, 202
cmp  byte ptr [bx], 0          ; This is at address 0124
jnp  012C
mov  byte ptr [bx], 30
inc  bx                        ; This is at address 012C
loop 0124
mov  byte ptr [bx], '$'
dec  bx
mov  si, 202
mov  al, [bx]                  ; This is at address 0136
mov  dl, [si]
mov  [bx], dl
mov  [si], al
inc  si
dec  bx
cmp  si, bx
jb   0136
mov  dx, 202
mov  ah, 09
int  21
mov  ah, 4C
int  21

n lab1.com
r cx
800
w
q

The w command expects the file size in BX:CX. I assume you have checked that BX=0?

EDIT 2

This edit deals with the OP's effort to refine the program so as to exempt the numerical digits from conversion.
This is the OP's code that reportedly crashes:

a200
db 50 00

a300
db 'Enter string' 0D 0A '$'

a100
mov  ah, 09
mov  dx, 300
int  21
mov  ah, 0A
mov  dx, 200
int  21
mov  ah, 02
mov  dl, 0D
int  21
mov  ah, 02
mov  dl, 0A
int  21
xor  cx, cx
mov  bx, 201
mov  cl, [bx]
mov  bx, 202
mov  dl, byte ptr [bx]     ; Put numerical representation of character into dl register
cmp  dl, 39                ; Compare dl value with 39 (char '9')
jg   0132                  ; If dl value is greater, it is not a digit, jump to parity flag's definition
cmp  dl, 30                ; Compare dl value with 30 (char '0')
jge  013A                  ; If dl value is greater or equal then it is a digit, jump to the line where we increment bx
cmp  byte ptr [bx], 0
jnp  013A
mov  byte ptr [bx], 30
inc  bx
loop 0124
mov  byte ptr [bx], '$'
dec  bx
mov  si, 202
mov  al, [bx]
mov  dl, [si]
mov  [bx], dl
mov  [si], al
inc  si
dec  bx
cmp  si, bx
jb   0144
mov  dx, 202
mov  ah, 09
int  21
mov  ah, 4C
int  21

n lab1.com
r cx
800
w
q

The problem this time is that the jump targets are off by 2.
You have added 5 lines of new code to your program. Together they take up 12 bytes, but your jump targets were computed as if they took 14 bytes.

0124: mov  dl, byte ptr [bx]   2 bytes
0126: cmp  dl, 39              3 bytes
0129: jg   0132                2 bytes
012B: cmp  dl, 30              3 bytes
012E: jge  013A                2 bytes
0130:                         --------
                              12 bytes
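The address arithmetic in the table above can be reproduced with a few lines of Python. This is only a sketch: `layout` is a hypothetical helper, and the instruction sizes are simply the ones listed in the table.

```python
def layout(start: int, sizes: list[int]) -> tuple[list[int], int]:
    """Return the address of each instruction and the first free address after them."""
    addrs = []
    addr = start
    for size in sizes:
        addrs.append(addr)
        addr += size
    return addrs, addr

# Sizes in bytes of the 5 new instructions, taken from the table above.
new_code = [("mov dl, [bx]", 2), ("cmp dl, 39", 3), ("jg", 2), ("cmp dl, 30", 3), ("jge", 2)]
addrs, next_free = layout(0x0124, [size for _, size in new_code])
for (text, _), addr in zip(new_code, addrs):
    print(format(addr, "04X"), text)
print(format(next_free, "04X"), "first address after the new code")
```

Everything that follows the new instructions therefore moves up by 12 bytes, and every jump target past that point has to move by the same amount.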

The whole code with corrections is:

a200
db 50 00

a300
db 'Enter string' 0D 0A '$'

a100
mov  ah, 09
mov  dx, 300
int  21
mov  ah, 0A
mov  dx, 200
int  21
mov  ah, 02
mov  dl, 0D
int  21
mov  ah, 02
mov  dl, 0A
int  21
xor  cx, cx
mov  bx, 201
mov  cl, [bx]
mov  bx, 202

0124: mov  dl, [bx]
0126: cmp  dl, 39
0129: ja   0130                  ; ERR jg   0132
012B: cmp  dl, 30
012E: jae  0138                  ; ERR jge  013A
0130: cmp  byte ptr [bx], 0
0133: jnp  0138                  ; ERR jnp  013A
0135: mov  byte ptr [bx], 30
0138: inc  bx
0139: loop 0124
013B: mov  byte ptr [bx], '$'
013E: dec  bx
013F: mov  si, 202
0142: mov  al, [bx]
0144: mov  dl, [si]
0146: mov  [bx], dl
0148: mov  [si], al
014A: inc  si
014B: dec  bx
014C: cmp  si, bx
014E: jb   0142                  ; ERR jb   0144

mov  dx, 202
mov  ah, 09
int  21
mov  ah, 4C
int  21

n lab1.com
r cx
800
w
q

Tip1: ASCII codes are unsigned numbers. Therefore you should use the unsigned conditional branch instructions. I've used ja JumpIfAbove and jae JumpIfAboveOrEqual instead of jg JumpIfGreater and jge JumpIfGreaterOrEqual.
Tip2: If you used AL instead of DL in those few new lines, the program would shrink by another 2 bytes, because each cmp with AL takes 2 bytes instead of 3. You could do this as an exercise, but remember that the branch targets would have to change accordingly, again!

0124: mov  al, [bx]   2 bytes
0126: cmp  al, 39     2 bytes
0128: ja   012E       2 bytes
012A: cmp  al, 30     2 bytes
012C: jae  0136       2 bytes
012E:                --------
                     10 bytes
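To check the output of the digit-exempt program for a given input, a Python reference model can help. This is a sketch under the assumption that the input is plain single-byte text; `transform` is a name invented here, and the model describes the intended result rather than the DEBUG code itself.

```python
def transform(text: bytes) -> bytes:
    """Replace non-digit bytes of even parity by ASCII '0', then reverse the string."""
    out = bytearray(text)
    for i, b in enumerate(out):
        is_digit = 0x30 <= b <= 0x39                 # '0'..'9', compared as unsigned values
        if not is_digit and bin(b).count("1") % 2 == 0:
            out[i] = 0x30                            # same store as mov byte ptr [bx], 30
    out.reverse()
    return bytes(out)

print(transform(b"AB3C"))
```

Here 'A' and 'B' have even parity and become '0', '3' also has even parity but is kept because it is a digit, and 'C' has odd parity and is kept.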

For human readers, I'd suggest using `jnpe` or `jpo` so they don't have to check the manual to remind themselves whether PF=1 means the value had even or odd parity. i.e. the semantic meaning you want involves the odd vs. even parity of the original value, not the detail of whether the bit in FLAGS is 0 or 1. (By contrast, after `comisd` or `fcomi`, you *would* want `jp` to jump if the comparison was unordered, since the 0/1 status of the flag itself is the semantic meaning you want.) — Peter Cordes, Oct 17 '22 at 05:56
@PeterCordes your explanation is perfect. I dedicated some time, tried to implement this myself. However, my program does not work properly. It seems that it runs infinite loop, does not reach output and exit. Here is my code - [assembly code](https://paste.debian.net/plain/1257753) — user10203585, Oct 20 '22 at 16:44
@user10203585 I reviewed your newest code and have added the full program with corrections to my answer. — Sep Roland, Oct 20 '22 at 22:23
@SepRoland One more thing. For practice I tried to update the code so that, if entered string's character is digit, from 0 to 9 inclusive [0;9], the program does not change it to zero, no matter the parity flag’s condition. Again, the program does not work properly. Now it crashes when I put in digits as the input and zeros out all characters with any other input that does not contain digits. Here is updated code - [update](https://paste.debian.net/plain/1257933) — user10203585, Oct 22 '22 at 10:57
@user10203585 I reviewed your updated code and have added the full program with corrections to my answer. Nice work! — Sep Roland, Oct 22 '22 at 18:55
@user10203585: DOS `debug.exe` seems like a big waste of time the way it makes you hard-code jump offsets or target addresses. Use an assembler, they do that for you with symbolic labels for jump targets. For example, NASM is a good one. — Peter Cordes, Oct 22 '22 at 18:59
