
The problem is that when I build a 32-bit application.exe, I get an executable containing 16-bit machine code.

Here is the code (taken from a book):

    .386
    .model flat
    .const
URL db  "http://www.lionking.org/`cubbi/", 0
    .code

_start:
    xor ebx, ebx
    push ebx
    push ebx
    push ebx
    push offset URL
    push ebx
    push ebx
;        call ShellExecute 
    push ebx
;        call ExitProcess

end     _start

To build the application, I run the following in the console:

  • ml winurl.asm (I also tried ml /c winurl.asm, but the result is the same)
  • link winurl.obj

Then I get an executable file with 16-bit machine code:

CPU = ?86, Virtual 8086 Mode, Id/Step = 0F62, A20 enabled
09E4:0000 33DB   XOR    BX,BX
09E4:0002 53     PUSH   BX
09E4:0003 53     PUSH   BX
09E4:0004 53     PUSH   BX
09E4:0005 680000 PUSH   0000h
09E4:0008 0000   ADD    [BX+SI],AL
09E4:000A 53     PUSH   BX
09E4:000B 53     PUSH   BX
09E4:000C 53     PUSH   BX
09E4:000D 0000   ADD    [BX+SI],AL
09E4:000F 006874 ADD    [BX+SI+74h],CH
09E4:0012 7470   JZ Short 0084

I don't need properly working code. I just want to assemble an application with 32-bit code, or to understand what I'm doing wrong.

Thank you for paying attention.

Alexey Frunze
Artem Bulatov
  • Are you sure it's not just your disassembler that's misinterpreting the executable? What if you open the file in e.g. PE Explorer? – Michael Feb 21 '13 at 16:52
  • @Michael, I'd already disassembled my other program that worked properly with the same assembler and debugger (masm611 and Debug32, respectively). There is incorrect machine code. This is the problem. – Artem Bulatov Feb 21 '13 at 17:19
  • No, the machine code for `push bx` and `push ebx` is exactly the same. The difference is the expectation of what kind of code segment it is going to be loaded into. – Bo Persson Feb 21 '13 at 17:31
  • The assembled code *is* correct (it's 32-bit), it's your disassembler/simulator that tries to interpret it as 16-bit code, for whatever reason. – Igor Skochinsky Feb 21 '13 at 17:37
  • @BoPersson, both IDA and Debug32 interpret the program as 16-bit. To prove this I've opened 32-bit application and they were interpreted correctly (as 32-bit). In the listing above (in the question) the machine code for `push bx` is 53. The `push ebx` code is 6653. – Artem Bulatov Feb 21 '13 at 18:00
  • The code for `push bx` is 53 in 16-bit mode. The code for `push ebx` is 53 in 32-bit mode. So you can't use that to tell the difference. – Bo Persson Feb 21 '13 at 18:05
  • @IgorSkochinsky, no, the disassembler I use recognizes i386 instructions. I'd note that the code `CSEG segment .386 Start: CSEG ends end Start` works perfectly as 32-bit (excluding unset data and stack registers). – Artem Bulatov Feb 21 '13 at 18:15
  • It seems to me that one way to detect heuristically from a x86 16-bit disassembly that it's not 16-bit code is the relatively high number of `add [bx+si],al` instructions (encoded as `00 00`). As 32-bit and 64-bit instructions have longer immediate operands, in 16-bit disassembly many times these longer operands with zero bytes get disassembled as `add [bx+si],al`, as can be seen here and in [my answer to Disassembling file that contain big data or is compressed](http://stackoverflow.com/questions/14470900/disassembling-file-that-contain-big-data-or-is-compressed/14471274#14471274). – nrz Feb 21 '13 at 19:33
  • @BoPersson, you're right. I specified 32-bit disassembly, and then code 53 was interpreted as the instruction `push ebx`. In addition, 6653 became `push bx`. – Artem Bulatov Feb 22 '13 at 18:25

2 Answers


Unless you tell the disassembler that your code is 16-bit (or 32-bit) and unless it can guess it somehow (e.g. based on the format of the executable, if any), the disassembler cannot know which one of the two it is.

I've taken the instruction bytes from your 16-bit disassembly and disassembled them as 32-bit code:

00000000: 33DB                           xor       ebx,ebx
00000002: 53                             push      ebx
00000003: 53                             push      ebx
00000004: 53                             push      ebx
00000005: 6800000000                     push      00000000
0000000A: 53                             push      ebx
0000000B: 53                             push      ebx
0000000C: 53                             push      ebx
0000000D: 0000                           add       [eax],al ; 0s between code & data
0000000F: 006874                         add       [eax+74],ch ; db 0,"ht"
00000012: 7470                           je ; db "tp"

This is the correct 32-bit machine code generated from your assembly source; you're just not disassembling it correctly. Somehow you're disassembling it as 16-bit code, which is wrong.
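To make the ambiguity concrete, here is a minimal sketch of a decoder for just the three opcodes that appear in the listing (`33 /r` in register-to-register form, `50`-`57`, and `68`), plus the `66` operand-size prefix mentioned in the comments. The function name `decode` and the `db` fallback for unknown bytes are assumptions made for illustration; a real disassembler would decode those stray bytes as further instructions (e.g. `00 00` as `add [bx+si],al`).

```python
# Register names indexed by the 3-bit register number used in the encoding.
REGS = {
    16: ["ax", "cx", "dx", "bx", "sp", "bp", "si", "di"],
    32: ["eax", "ecx", "edx", "ebx", "esp", "ebp", "esi", "edi"],
}

def decode(code, bits):
    """Decode a tiny opcode subset, assuming a 16- or 32-bit code segment."""
    out, i = [], 0
    while i < len(code):
        size = bits
        # The 66h prefix switches the operand size to the "other" one.
        if code[i] == 0x66 and i + 1 < len(code):
            size = 48 - bits
            i += 1
        op = code[i]
        if 0x50 <= op <= 0x57:                      # push r16 / push r32
            out.append(f"push {REGS[size][op - 0x50]}")
            i += 1
        elif op == 0x68:                            # push imm16 / push imm32
            n = size // 8                           # immediate is 2 or 4 bytes
            imm = int.from_bytes(code[i + 1:i + 1 + n], "little")
            out.append(f"push {imm:0{2 * n}X}h")
            i += 1 + n
        elif op == 0x33 and i + 1 < len(code) and code[i + 1] >= 0xC0:
            modrm = code[i + 1]                     # mod = 11: both operands are registers
            out.append(f"xor {REGS[size][(modrm >> 3) & 7]}, {REGS[size][modrm & 7]}")
            i += 2
        else:                                       # outside this sketch's subset
            out.append(f"db {op:02X}h")
            i += 1
    return out

code = bytes.fromhex("33DB5353536800000000535353")
print(decode(code, 32))
print(decode(code, 16))
```

Decoded with `bits=32`, the `68` opcode consumes all four zero bytes as its immediate. With `bits=16` it consumes only two, and the remaining `00 00` are left over to be misread as a separate instruction, which is what the 16-bit listing in the question shows.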

Alexey Frunze

You should tell your disassembler that you've created 32-bit code. Here is the proof. From:

push offset URL

The disassembler showed this:

09E4:0005 680000 PUSH   0000h  
09E4:0008 0000   ADD    [BX+SI],AL

You can see that what is shown as the second instruction, with opcode 0000h, is really the rest of the first instruction's operand. The disassembler thinks the operand is 2 bytes long when it is actually 4 bytes: opcode 68 takes a 16-bit immediate in 16-bit code and a 32-bit immediate in 32-bit code.

radl