
I'm using this component, http://sourceforge.net/projects/tponguard/, and now I need to compile it for 64-bit. I'm stuck on this assembly.

It was like this:

  push esi
  push edi

  mov  esi, eax         //esi = Mem1
  mov  edi, edx         //edi = Mem2

  push ecx              //save byte count
  shr  ecx, 2           //convert to dwords
  jz   @Continue

  cld
@Loop1:                 //xor dwords at a time
  mov  eax, [edi]
  xor  [esi], eax
  add  esi, 4
  add  edi, 4
  dec  ecx
  jnz  @Loop1

@Continue:              //handle remaining bytes (3 or less)
  pop  ecx
  and  ecx, 3
  jz   @Done

@Loop2:                 //xor remaining bytes
  mov  al, [edi]
  xor  [esi], al
  inc  esi
  inc  edi
  dec  ecx
  jnz  @Loop2

@Done:
  pop  edi
  pop  esi

And I changed it to this:

  push rsi
  push rdi

  mov  rsi, rax         //esi = Mem1
  mov  rdi, rdx         //edi = Mem2

  push rcx              //save byte count
  shr  rcx, 2           //convert to dwords
  jz   @Continue

  cld
@Loop1:                 //xor dwords at a time
  mov  rax, [rdi]
  xor  [rsi], rax
  add  rsi, 4
  add  rdi, 4
  dec  rcx
  jnz  @Loop1

@Continue:              //handle remaining bytes (3 or less)
  pop  rcx
  and  rcx, 3
  jz   @Done

@Loop2:                 //xor remaining bytes
  mov  al, [rdi]
  xor  [rsi], al
  inc  rsi
  inc  rdi
  dec  rcx
  jnz  @Loop2

@Done:
  pop  rdi
  pop  rsi

But now I get an access violation at xor [rsi], rax.

PhiS
  • Calling conventions changed in 64-bit, so the input pointer arguments aren't in RAX and RDX. Furthermore, it makes little sense to use a function for applying an operation to a sequence of dwords when the natural size of the CPU is qword now. Consider writing the function in Delphi instead of sticking with assembler. – Rob Kennedy Jun 10 '13 at 12:36
  • I'm with Rob. Port it to Pascal and let the compiler worry about the detail. Port in 32 bit first so that you can test comprehensively with what is known to be good. Then you can simply compile that for 64 bit. – David Heffernan Jun 10 '13 at 12:56
  • Can you help me with this? I don't know what this function does. – user2470881 Jun 10 '13 at 12:56
  • It performs an XOR of two byte buffers: a[i] := a[i] xor b[i]. It would be better to show the function declaration. – MBo Jun 10 '13 at 13:07

1 Answer


The function you are looking at is

procedure XorMem(var Mem1; const Mem2; Count : Cardinal); register;

from the ogutil unit.

Personally, I would not bother converting this to x64 assembler. There are quite a few tricky details that you need to get right in order to do so. It makes more sense to me to port to Pascal and let the compiler deal with the details. The simplest, most naive translation looks like this:

procedure XorMem(var Mem1; const Mem2; Count: Cardinal);
var
  p1, p2: PByte;
begin
  p1 := PByte(@Mem1);
  p2 := PByte(@Mem2);
  while Count>0 do
  begin
    p1^ := p1^ xor p2^;
    inc(p1);
    inc(p2);
    dec(Count);
  end;
end;

If this is performance-critical, then you'd want to process larger operands on each iteration: say, 32-bit operands on x86 and 64-bit operands on x64.

A version that operates on 32-bit operands might look like this:

procedure XorMem(var Mem1; const Mem2; Count: Cardinal);
var
  p1, p2: PByte;
begin
  p1 := PByte(@Mem1);
  p2 := PByte(@Mem2);
  while Count>3 do
  begin
    PCardinal(p1)^ := PCardinal(p1)^ xor PCardinal(p2)^;
    inc(p1, 4);
    inc(p2, 4);
    dec(Count, 4);
  end;
  while Count>0 do
  begin
    p1^ := p1^ xor p2^;
    inc(p1);
    inc(p2);
    dec(Count);
  end;
end;

Actually, it is easy enough to write a version that automatically uses 32-bit or 64-bit operands, as determined by the compilation target. The trick is to use the NativeUInt type, which is the size of a machine word.

procedure XorMem(var Mem1; const Mem2; Count: Cardinal);
var
  p1, p2: PByte;
begin
  p1 := PByte(@Mem1);
  p2 := PByte(@Mem2);
  while Count>SizeOf(NativeUInt)-1 do
  begin
    PNativeUInt(p1)^ := PNativeUInt(p1)^ xor PNativeUInt(p2)^;
    inc(p1, SizeOf(NativeUInt));
    inc(p2, SizeOf(NativeUInt));
    dec(Count, SizeOf(NativeUInt));
  end;
  while Count>0 do
  begin
    p1^ := p1^ xor p2^;
    inc(p1);
    inc(p2);
    dec(Count);
  end;
end;

This final version is pretty efficient when compiled with optimisations enabled. I would not look beyond that final Pascal version.
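One way to check that the word-size version behaves like the byte-wise one is a small console program along these lines. This is only a sketch and not part of OnGuard: the names XorMemBytes, XorMemWords and MaxLen are made up for the test, and it assumes a compiler that provides NativeUInt and PNativeUInt (Delphi XE2 or later).

```pascal
program XorMemCheck;

{$APPTYPE CONSOLE}

// Byte-at-a-time reference (same logic as the naive translation).
procedure XorMemBytes(var Mem1; const Mem2; Count: Cardinal);
var
  p1, p2: PByte;
begin
  p1 := PByte(@Mem1);
  p2 := PByte(@Mem2);
  while Count > 0 do
  begin
    p1^ := p1^ xor p2^;
    Inc(p1);
    Inc(p2);
    Dec(Count);
  end;
end;

// Machine-word-at-a-time version (same logic as the NativeUInt version).
procedure XorMemWords(var Mem1; const Mem2; Count: Cardinal);
var
  p1, p2: PByte;
begin
  p1 := PByte(@Mem1);
  p2 := PByte(@Mem2);
  while Count > SizeOf(NativeUInt) - 1 do
  begin
    PNativeUInt(p1)^ := PNativeUInt(p1)^ xor PNativeUInt(p2)^;
    Inc(p1, SizeOf(NativeUInt));
    Inc(p2, SizeOf(NativeUInt));
    Dec(Count, SizeOf(NativeUInt));
  end;
  // Leftover bytes that do not fill a whole machine word.
  XorMemBytes(p1^, p2^, Count);
end;

const
  MaxLen = 64; // hypothetical upper bound for the test buffers

var
  A, B, Src: array[0..MaxLen - 1] of Byte;
  Len, I: Integer;
begin
  RandSeed := 1; // fixed seed so every run uses the same data
  for Len := 0 to MaxLen do
  begin
    for I := 0 to MaxLen - 1 do
    begin
      A[I] := Random(256);
      Src[I] := Random(256);
    end;
    B := A; // both routines start from identical data
    XorMemBytes(A, Src, Len);
    XorMemWords(B, Src, Len);
    // Compare the whole buffer, so bytes beyond Len are checked as untouched too.
    for I := 0 to MaxLen - 1 do
      if A[I] <> B[I] then
      begin
        Writeln('Mismatch: Count=', Len, ' index=', I);
        Halt(1);
      end;
  end;
  Writeln('All counts match');
end.
```

Building the same program for Win32 and Win64 exercises both the 4-byte and 8-byte word paths, including every possible number of leftover bytes.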

David Heffernan
  • The third version worked perfectly in 32-bit, but not in 64-bit. I passed this in Mem1: (229, 143, 132, 214, 146, 201, 164, 216, 26, 250, 111, 141, 171, 252, 223, 180). In 32-bit it changes to (177, 202, 215, 130, 146, 201, 164, 216, 26, 250, 111, 141, 171, 252, 223, 180) and in 64-bit to (177, 202, 215, 130, 146, 201, 164, 216, 58, 16, 125, 141, 171, 252, 223, 180) – user2470881 Jun 10 '13 at 13:45
  • XorMem(Key, Modifier, Min(SizeOf(Modifier), KeySize)); (Modifier = 'Teste') – user2470881 Jun 10 '13 at 13:48
  • What you report in that first comment is not correct. To get the output you claim in 32 bit you must have `Mem2` equal to: `(84, 69, 83, 84, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0)`. And with that input you get the same output on x86 and x64. The problem can be seen in the code in the second comment. You need to pass a byte array of length 16, but you are passing `'Teste'`, whatever that is. For sure it's not a byte array of length 16. Each version of code in my example has identical behaviour as the original asm function. – David Heffernan Jun 10 '13 at 13:54
  • You are quite right to test carefully the code in my answer. However, the test must be done correctly. Use byte arrays and make sure that `Mem1` and `Mem2` have the same length. You can use random data if you wish. Compare the original asm against the Pascal in my answer. Or use the original asm to generate test case data. And then compare x86 and x64 versions against that test case data. – David Heffernan Jun 10 '13 at 13:56
  • Oh, sorry, I'm passing a LongInt = 1414743380 in Mem2, on both x86 and x64. – user2470881 Jun 10 '13 at 13:59
  • A longint is no good. As I said, you need an object of size 16. You are passing 4 bytes, which means the remaining 12 bytes of `Mem2` are ill-defined. Pass a byte array of length 16 and observe that all versions of code in my answer match the original asm version. I think you need to spend a little time understanding the inputs to this `XorMem` function a little better. – David Heffernan Jun 10 '13 at 14:02
  • Watch out with NativeInt and older Delphi versions: in Delphi2007 NativeInt is an Int64 instead of an Int32. – Ritsaert Hornstra Jun 10 '13 at 14:44
  • @RitsaertHornstra That's a fair point. I'm using `NativeUInt` which won't compile at all in D2007 if memory serves. In any case, since the question is specifically about x64 it's safe to assume XE2 or later here. – David Heffernan Jun 10 '13 at 14:45
  • Now it's working, thanks guys! The problem was the Count parameter: I was passing 16 when it should be 4. – user2470881 Jun 10 '13 at 16:22
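
To illustrate the point the comments arrive at, here is a minimal sketch using the values quoted above. The variable names Key and Modifier and the program name are assumptions for illustration, and XorMem is the naive byte-wise translation. Count must not exceed the size of the smaller operand, so with a 4-byte LongInt modifier it is 4, not 16.

```pascal
program XorMemCount;

{$APPTYPE CONSOLE}

// Naive byte-wise XorMem, as in the first Pascal version of the answer.
procedure XorMem(var Mem1; const Mem2; Count: Cardinal);
var
  p1, p2: PByte;
begin
  p1 := PByte(@Mem1);
  p2 := PByte(@Mem2);
  while Count > 0 do
  begin
    p1^ := p1^ xor p2^;
    Inc(p1);
    Inc(p2);
    Dec(Count);
  end;
end;

var
  // 16-byte key quoted in the comments.
  Key: array[0..15] of Byte = (229, 143, 132, 214, 146, 201, 164, 216,
                               26, 250, 111, 141, 171, 252, 223, 180);
  // 1414743380 = $54534554, stored as bytes 84, 69, 83, 84 on little-endian x86/x64.
  Modifier: LongInt = 1414743380;
  I: Integer;
begin
  // Only SizeOf(Modifier) = 4 bytes of Mem2 are valid, so Count is 4.
  // Passing 16 would read 12 bytes past the end of Modifier.
  XorMem(Key, Modifier, SizeOf(Modifier));
  for I := 0 to High(Key) do
    Write(Key[I], ' ');
  Writeln;
end.
```

On x86 and x64 this should print 177 202 215 130 followed by the remaining twelve key bytes unchanged, which matches the 32-bit result reported above.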