
There seems to be an undocumented constant eof that is available inside asm blocks. I tested this with Delphi 7.

program TestEof;
{$APPTYPE CONSOLE}
var
  example : Integer;
begin
  asm
    mov example, eof
  end;
  writeln(example);
  readln;
end.

This prints out 14.

Where do that constant eof and its value, $0E or 14, come from?


EDIT: this is the compilation result

...
call @InitExe
// mov example, eof
mov [example], $0000000e
// writeln(example)
mov eax, [$004040a4]
mov edx, [example]
call @Write0Long
call @WriteLn
call @_IOTest
// readln;
...
Egon

1 Answer


Eof is in fact a function defined in the System unit.

In the implementations of Delphi that I have at hand, Delphi 6 and XE2, Eof is implemented as an intrinsic routine that results in a call to one of the following functions, as appropriate:

function _EofFile(var f: TFileRec): Boolean;
function _EofText(var t: TTextRec): Boolean;
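To illustrate the dispatch described above, here is a minimal sketch of Pascal-level Eof applied once to a text file and once to a typed file. Per the answer, the first use would end up in _EofText and the second in _EofFile, although that mapping is taken from the answer and is not verified here. The scratch file names `eof_demo.txt` and `eof_demo.dat` are hypothetical and exist only for this example.

```pascal
program EofDispatch;
{$IFDEF FPC}{$MODE DELPHI}{$ENDIF}
{$APPTYPE CONSOLE}

const
  // Hypothetical scratch file names, used only for this sketch.
  TextName = 'eof_demo.txt';
  DataName = 'eof_demo.dat';

var
  T: TextFile;
  F: file of Integer;
  Line: string;
  I, Value, Count: Integer;
begin
  // Create a two-line text file.
  AssignFile(T, TextName);
  Rewrite(T);
  Writeln(T, 'first');
  Writeln(T, 'second');
  CloseFile(T);

  // Eof on a text file variable (the text-file flavour of the intrinsic).
  Count := 0;
  Reset(T);
  while not Eof(T) do
  begin
    Readln(T, Line);
    Inc(Count);
  end;
  CloseFile(T);
  Writeln('text lines: ', Count);   // prints "text lines: 2"

  // Create a typed file holding three integers.
  AssignFile(F, DataName);
  Rewrite(F);
  for I := 1 to 3 do
  begin
    Value := I * 10;
    Write(F, Value);
  end;
  CloseFile(F);

  // Eof on a typed file variable (the binary-file flavour of the intrinsic).
  Count := 0;
  Reset(F);
  while not Eof(F) do
  begin
    Read(F, Value);
    Inc(Count);
  end;
  CloseFile(F);
  Writeln('integers: ', Count);     // prints "integers: 3"

  // Remove the scratch files.
  Erase(T);
  Erase(F);
end.
```

The same identifier is used in both loops; the compiler picks the helper from the type of the file variable, which is why Eof cannot be declared as an ordinary Pascal function.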

I have no idea why your assembler code is turned into mov [...],$0000000e. You point out in a comment that the System unit itself makes use of eof in asm code, for example in TextOpen. The same code in XE2 is now pure Pascal and searches for a value of $1A instead of $0E. This would very much appear to be an implementation detail. If you want to understand why this is so then I think you will need to reverse engineer the code in the System unit, or see if the engineers at Embarcadero will explain the implementation to you.
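As a rough illustration of the kind of scan being discussed, the sketch below looks for the first Ctrl-Z character ($1A) in a buffer and treats everything before it as the logical content. This is not the RTL's TextOpen code. The function `LogicalLength` is a hypothetical helper, and the constant is named `cEOF` only to match the name mentioned in the comments on this answer.

```pascal
program ScanForCtrlZ;
{$IFDEF FPC}{$MODE DELPHI}{$ENDIF}
{$APPTYPE CONSOLE}

const
  // Ctrl-Z (#26), the conventional DOS end-of-file marker.
  // Declared locally for this sketch; not taken from System.pas.
  cEOF = #$1A;

// Hypothetical helper: returns the number of characters that precede
// the first Ctrl-Z, or the full length if no Ctrl-Z is present.
function LogicalLength(const Buffer: string): Integer;
var
  I: Integer;
begin
  Result := Length(Buffer);
  for I := 1 to Length(Buffer) do
    if Buffer[I] = cEOF then
    begin
      Result := I - 1;
      Break;
    end;
end;

begin
  Writeln(LogicalLength('abc' + cEOF + 'def'));  // prints 3
  Writeln(LogicalLength('no marker'));           // prints 9
end.
```

If the comparison value were $0E instead of $1A, the same loop would cut the content at the first Shift Out character, which would be consistent with the unexpected truncation described in the comments.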

David Heffernan
  • This is actually used in `system.pas` as well. See the `TextOpen` function: right after the comment `//if (f.Buffer[i] == eof)` there is the line `CMP byte ptr [ESI].TTextRec.Buffer[EAX],eof`. – Egon Jan 02 '12 at 16:26
  • Strictly speaking, it is a built-in. The built-in calls the RTL helpers, which are the functions that you specify, but it can also generate code and call other functions (e.g. iotest). The EOF might be some internal sequence number for the built-in, to help streamline multiple built-ins that are similar (e.g. all IOCheck/IOTest built-ins), but without the compiler source that is guesswork. – Marco van de Voort Jan 02 '12 at 16:33
  • And in XE2 the equivalent code looks for `$1A`. This is clearly an implementation detail in System.pas. What problem are you trying to solve, or is this just out of curiosity? – David Heffernan Jan 02 '12 at 16:36
  • @Marco I actually think that eof in asm is different from eof in pascal. See Egon's first comment. But I do agree that it's all a bit academic unless you are the compiler vendor. – David Heffernan Jan 02 '12 at 16:38
  • @Marco When you say "built-in", is that different from "intrinsic" as per the documentation link that I provided in the answer? – David Heffernan Jan 02 '12 at 16:45
  • @DavidHeffernan if you look into the Windows version of the `TextOpen` function, then after encountering this character the file will be truncated. In the Linux version `cEOF` is used, which is `$1A`. – Egon Jan 02 '12 at 16:45
  • @egon In XE2 the Windows version uses `cEOF` too. Can I ask again, is this just curiosity? – David Heffernan Jan 02 '12 at 16:46
  • @DavidHeffernan mostly curiosity and because TextOpen truncated the file. So essentially I got the reason why it was truncated, but I couldn't figure out why it should compile to that value `$0E`, instead of `$1A`. – Egon Jan 02 '12 at 16:51
  • David, EOF (hex 1A, decimal 26) has meant EOF since the early days of DOS. It's the standard end of file character, and (ages ago - pre-DOS 5, IIRC) used to be required in text files to indicate where the file actually stopped. It's documented as EOF in any standard ASCII table. (If you have GExperts, for instance, look at their ASCII chart.) If you use `copy con test.txt` from a command prompt in Win7, you can type a line of text and then hit F6 to insert an EOF char, the same way you used to do it with early DOS versions. – Ken White Jan 02 '12 at 17:34
  • @ken where does 0E come from? – David Heffernan Jan 02 '12 at 17:44
  • @David, I don't know. It's defined in the ASCII table as `SO`, or `Shift Out`. IIRC, a lot of the ASCII codes were derived from teletypes and terminals (which is where things like `CR` and `LF` came from, along with `STX` and `ETX`, 0x02 and 0x03, 'start of text' and 'end of text' respectively). I can't see any reason it would show up in the OP's code, though. – Ken White Jan 02 '12 at 17:53
  • @Ken White: So [the Wikipedia ASCII table](http://en.wikipedia.org/wiki/ASCII#ASCII_control_characters) is *not* a 'standard ASCII table'? In this table, `#26` (`#$1A`) is called `SUB` (substitute). – Andreas Rejbrand Jan 02 '12 at 19:18
  • @Ken White: Still, you are right, it seems. Wikipedia says that `^Z` (that is, `#26`) is commonly used as the [EOF character](http://en.wikipedia.org/wiki/End-of-file). – Andreas Rejbrand Jan 02 '12 at 19:25
  • @Andreas That's how you terminate the interactive Python interpreter on Windows, isn't it? Whoever closed an input stream with ^N? – David Heffernan Jan 02 '12 at 19:27
  • @Andreas, you're right. I must have remembered wrong about the chart; I just looked, and in fact it isn't labeled EOF as I thought. The #26 being EOF in text files, though, I was certain was right. I remember it well. :) – Ken White Jan 02 '12 at 19:43