
SOLVED: see the follow-up question on Stack Overflow: how can i get the dos exe build from segments in multiple asm files binary identical to the single asm file version

UPDATE 0: solved the external problem but i've got a problem with setting the data segment, see UPDATES

(also posted this question on: https://www.reddit.com/r/asm/comments/d33mpn/masm_x86_16bit_dos_exe_how_to_split_single_asm/)

im working on a tool that helps with binary-identical reassembling of 16-bit DOS asm output from IDA Pro. for that im using a "well defined" input exe based on this asm code (its my playground for several hard-to-reverse features, heavily cut down for this question)

[all_in_one.asm]

.model medium
.386

seg000 segment use16
text db 'Hello World!',0ah,0dh,'$'
seg000 ends

seg001 segment use16
;assume cs:seg001 ; for TASM
start proc
  mov ax,seg seg000
  mov ds,ax
  push ax
  pop ax
  call print
  mov ax,4c00h
  int 21h
start endp

seg001 ends

seg002 segment use16
;assume cs:seg002 ; for TASM
print proc far
  mov dx,offset text
  mov ah,09h
  int 21h
  retf
print endp

seg002 ends

seg003 segment stack use16
  db 256 dup (?)
seg003 ends

end start

im intentionally not using .STACK or other MASM features, because typical reversed output from IDA is full of special cases (multiple stacks, segment ordering etc.) that i want to replicate 1:1. im not interested in how to write the same thing shorter in MASM, only the exact result is relevant for me

i can build the above code using

[build.cmd]

:: MASM x86, 14.16.27032.1 from VS2017 Community (current UASM, TASM also work)
ml.exe /c /omf all_in_one.asm
:: UniLink v1.11 [beta] (build 11.23), ftp://ftp.styx.cabel.net/pub/UniLink/
ulink.exe all_in_one.obj
:: Microsoft Linker 16bit, Version 5.60.339 Dec 5 1994
link.exe all_in_one.obj,,,,,

and create a fully working 16-bit DOS executable (tested with IDA and Dosbox) with both linkers (should also work with TLink, WLink, Optlink, Alink and any other linker that supports 16-bit DOS executables)

everything is fine... but

reversed executables tend to have huge segments, sometimes 20 or even over 100 of them, so my tool splits the segments into single files. its easier to get the single segment files binary compatible with the original if not everything is lumped together

so i split the file into segment files (plus inc files for forward declaration. maybe there is a better way to do that with MASM: the forward-declared data should be as type safe as possible. i've read about EXTERNDEF and others but i have no idea what the pros/cons of using them are)

[seg000.asm]

.model medium
.386

seg000 segment use16
  text db 'Hello World!',0ah,0dh,'$'
seg000 ends

end

[seg000.inc]

seg000 segment use16
  extern text:far
seg000 ends

[seg001.asm]

.model medium
.386

; forward declare segments
include seg000.inc
include seg002.inc

seg001 segment use16

start proc
  mov ax,seg seg000
  mov ds,ax
  push ax
  pop ax
  call print
  mov ax,4c00h
  int 21h
start endp

seg001 ends

end start

[seg002.asm]

.model medium
.386

; forward declare segments
include seg000.inc

seg002 segment use16

print proc far
  mov dx,offset text
  mov ah,09h
  int 21h
  retf
print endp

seg002 ends

end

[seg002.inc]

seg002 segment use16
  extrn print:far
seg002 ends

[seg003.asm]

.model medium
.386

seg003 segment stack use16
  db 256 dup (?)
seg003 ends

end

and try to build it with

[build_multiple.cmd]

ml.exe /c /omf seg000.asm
ml.exe /c /omf seg001.asm
ml.exe /c /omf seg002.asm
ml.exe /c /omf seg003.asm
ulink.exe seg000.obj seg001.obj seg002.obj seg003.obj
link.exe seg000.obj seg001.obj seg002.obj seg003.obj,,,,,

the assembling works for all asm files, but both linkers return errors

UniLink v1.11 [beta] (build 11.23)
Error: Unresolved external 'text' referenced from 'seg001.obj'

Microsoft (R) Segmented Executable Linker  Version 5.60.339 Dec  5 1994
Copyright (C) Microsoft Corp 1984-1993.  All rights reserved.

seg002.obj(seg002.asm) : error L2029: 'text' : unresolved external
seg001.obj(seg001.asm) : error L2029: 'text' : unresolved external

is my forward declaration of the segment content correct? any idea how to fix the unresolved external linker error?

UPDATE 1: unresolved external fixed with public

i changed my data segment to:

seg000 segment use16
  public text ; <====
  text db 'Hello World!',0ah,0dh,'$'
seg000 ends

now it links, but something seems to be wrong with my stack segment

UniLink v1.11 [beta] (build 11.23)
Error: SP value out of range
Error: SS value out of range

the Microsoft Linker does not print any error but the resulting exe prints just garbage

UPDATE 2:

the all_in_one and split-segments exes are binary identical (including the header) except for one byte

the mov ax,seg seg000 in seg001 is encoded differently

[all_in_one exe]

B8 (00) 10 --> mov ax, 0x1000 <-- 0x1000 is seg000(data)

[split exe]

B8 (01) 10 --> mov ax, 0x1001 <-- 0x1001 is seg001(code)

that means the split exe uses seg001 (my first code segment) instead of seg000 as the data segment, and then prints bytes from the code segment until a '$' is found in ram

why is that happening?

my idea so far is:

im including seg000.inc (literally an empty segment, only for forwarding and extern) right before the definition of seg001. could it be that the assembler thinks the included empty segment is located exactly there, and because its size is 0 it ends up at the same segment as seg001?

if that is the reason - how do i forward my procs and values then?

llm
  • Quick guess (unchecked): An EXTERN in one .obj file should correspond with a PUBLIC in another .obj file. – rkhb Sep 12 '19 at 09:17
  • If you have a new question don't edit your existing question to include it, instead post a new question. Once your question has been answered you're not allowed to change your question so that it would invalidate existing answers. – Ross Ridge Sep 12 '19 at 15:06

1 Answer


The first problem is solved by putting a PUBLIC in the right place.

The second problem is caused by the .MODEL directive and incomplete SEGMENT declarations.

Create a file that contains all SEGMENT declarations and include it in all .ASM files:

segments.inc:

seg000 segment para public use16 
seg000 ends

seg001 segment para public use16
seg001 ends

seg002 segment para public use16
seg002 ends

seg003 segment use16 stack
seg003 ends

The other files look like these:

seg000.asm:

.386
include segments.inc

public text

seg000 segment
  text db 'Hello World!',0ah,0dh,'$'
seg000 ends

end

seg001.asm:

.386
include segments.inc

extern print:FAR

seg001 segment

start proc
  mov ax, seg000
  mov ds,ax
  push ax
  pop ax
  call print
  mov ax,4c00h
  int 21h
start endp

seg001 ends

end start

seg002.asm:

.386
include segments.inc

extern text:BYTE

seg002 segment

print proc far
  mov dx,offset text
  mov ah,09h
  int 21h
  retf
print endp

seg002 ends

end

seg003.asm:

.386
include segments.inc

seg003 segment
  db 256 dup (?)
seg003 ends

end

I used LINK16.EXE from the MASM32 package and build_multiple.cmd in WinXP:

@ECHO OFF

ml.exe /c /omf seg000.asm
ml.exe /c /omf seg001.asm
ml.exe /c /omf seg002.asm
ml.exe /c /omf seg003.asm

C:\masm32\bin\link16.exe /L seg000.obj seg001.obj seg002.obj seg003.obj;

Check the .MAP file to verify that all segments are in the right order.
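
Regarding the EXTERNDEF mentioned in the question: as far as I know, it can replace the separate PUBLIC and EXTERN lines, so a single shared include can carry all forward declarations with their types. A minimal sketch, assuming MASM 6.0 or later. The file name externs.inc is made up for illustration, and I have not checked whether the resulting .exe stays binary identical:

```asm
; externs.inc (hypothetical file name): include it in every .asm file,
; right after segments.inc and outside of any SEGMENT block.
; In the module that defines a symbol, EXTERNDEF makes it PUBLIC;
; in modules that only reference it, EXTERNDEF acts as EXTERN;
; in modules that never use it, the declaration is ignored.

externdef text:BYTE    ; defined in seg000.asm, used in seg002.asm
externdef print:FAR    ; defined in seg002.asm, called from seg001.asm
```

With such an include in place, the `public text` line in seg000.asm and the `extern` lines in seg001.asm and seg002.asm would be dropped.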

rkhb
  • out of curiosity: does that also work if i've got several text variables or procs with the same name in different segments? – llm Sep 12 '19 at 13:39
  • it works but the all_in_one.exe is different, even in the exe header. my main goal is to be binary compatible with the combined version (that will come from IDA). i've removed the .model and corrected the segments in the all_in_one.asm but there are still changes in the resulting exe, i will check the map file for segment starts and sizes – llm Sep 12 '19 at 14:34
  • i've updated the question with the map information, small differences that break my binary compatibility between all_in_one.exe and multi.exe – llm Sep 12 '19 at 15:06
  • removing the .model medium directive results in a large call in the multi.exe, followed by a crash. i've read that using .model before .386 forces use16 – llm Sep 12 '19 at 15:42
  • @llm: You can produce nearly the same .exe files if you remove the class names ('DATA', 'CODE'). I've edited my answer accordingly. The multiple .exe is padded with zeroes at the end. I didn't find a solution. My solution works here with `link16.exe`. `ulink.exe` seems to have a bug. – rkhb Sep 12 '19 at 16:29
  • i've opened a follow-up question here: https://stackoverflow.com/questions/57911151/how-can-i-get-the-dos-exe-build-from-segments-in-multiple-asm-files-binary-ident – llm Sep 12 '19 at 16:38
  • "ulink.exe seems to have a bug" - fixed now – llm Sep 13 '19 at 11:28