I'm doing some reverse engineering on a 32-bit ELF executable. Here is the code of the .text section:

 08048080 <.text>:
 8048080:   b8 04 00 00 00          mov    eax,0x4
 8048085:   bb 01 00 00 00          mov    ebx,0x1
 804808a:   b9 a1 91 04 08          mov    ecx,0x80491a1
 804808f:   ba 26 00 00 00          mov    edx,0x26
 8048094:   cd 80                   int    0x80
 8048096:   b8 03 00 00 00          mov    eax,0x3
 804809b:   31 db                   xor    ebx,ebx
 804809d:   b9 88 91 04 08          mov    ecx,0x8049188
 80480a2:   ba 33 00 00 00          mov    edx,0x33
 80480a7:   cd 80                   int    0x80
 80480a9:   31 c9                   xor    ecx,ecx
 80480ab:   b8 80 80 04 08          mov    eax,0x8048080
 80480b0:   bb 23 81 04 08          mov    ebx,0x8048123
 80480b5:   e8 5b 00 00 00          call   0x8048115
 80480ba:   89 ca                   mov    edx,ecx
 80480bc:   b9 19 00 00 00          mov    ecx,0x19
 80480c1:   b8 55 91 04 08          mov    eax,0x8049155
 80480c6:   bb 88 91 04 08          mov    ebx,0x8049188
 80480cb:   d1 ca                   ror    edx,1
 80480cd:   8a 44 08 ff             mov    al,BYTE PTR [eax+ecx*1-0x1]
 80480d1:   8a 5c 0b ff             mov    bl,BYTE PTR [ebx+ecx*1-0x1]
 80480d5:   30 d8                   xor    al,bl
 80480d7:   30 d0                   xor    al,dl
 80480d9:   75 1b                   jne    0x80480f6
 80480db:   49                      dec    ecx
 80480dc:   75 e3                   jne    0x80480c1
 80480de:   b8 04 00 00 00          mov    eax,0x4
 80480e3:   bb 01 00 00 00          mov    ebx,0x1
 80480e8:   b9 24 91 04 08          mov    ecx,0x8049124
 80480ed:   ba 26 00 00 00          mov    edx,0x26
 80480f2:   cd 80                   int    0x80
 80480f4:   eb 16                   jmp    0x804810c
 80480f6:   b8 04 00 00 00          mov    eax,0x4
 80480fb:   bb 01 00 00 00          mov    ebx,0x1
 8048100:   b9 4a 91 04 08          mov    ecx,0x804914a
 8048105:   ba 0b 00 00 00          mov    edx,0xb
 804810a:   cd 80                   int    0x80
 804810c:   b8 01 00 00 00          mov    eax,0x1
 8048111:   31 db                   xor    ebx,ebx
 8048113:   cd 80                   int    0x80
 8048115:   29 c3                   sub    ebx,eax
 8048117:   31 c9                   xor    ecx,ecx
 8048119:   02 08                   add    cl,BYTE PTR [eax]
 804811b:   c1 c1 03                rol    ecx,0x3
 804811e:   40                      inc    eax
 804811f:   4b                      dec    ebx
 8048120:   75 f7                   jne    0x8048119
 8048122:   c3                      ret    

But I don't know what this line is supposed to do:

8048119:    02 08                   add    cl,BYTE PTR [eax]

I know that at this point EAX holds the address of the <.text> section and ECX = 0, but I don't see the point of dereferencing the pointer to this address: it gives me ECX = 0xcc.

Could you explain to me why? I really appreciate your help. I'm sorry for my bad English.

  • The loop in question seems like the sort of thing used to perform a quick-and-dirty checksum. Add all the bytes in some memory region and then compare to zero. (Though the disassembly ends before any such comparison is made, so maybe not.) – SoronelHaetir Jun 25 '22 at 17:33
  • @SoronelHaetir Yes, I think so too. The first loop is executed 163 times. But I don't know why dereferencing a pointer to the <.text> section address yields 0xcc ... – Rémi Hoarau Jun 25 '22 at 17:37
  • @xiver77 At 0x80480b5 it calls the first loop. When the first loop is finished, the RET instruction at 0x8048122 makes the instruction pointer point to 0x80480ba. Then the program continues to run. – Rémi Hoarau Jun 25 '22 at 17:48
  • Sorry, something's wrong with me, I just couldn't see that number somehow.. Kindly let me delete my previous comments.. – xiver77 Jun 25 '22 at 17:54
  • `eax` doesn't point to `.text`, just look at the instruction at `0x80480ab`. The function just loops from `eax` to `ebx` and computes `s = ROL(s + ptr[i], 3)` with `s=0` initially. The compiler implemented the loop by first computing the difference between the end address (`ebx`) and the start address (`eax`) and then looping as many times as this difference value. – Margaret Bloom Jun 25 '22 at 18:46
  • As for the `0xcc`, I assume you used a debugger with a software breakpoint, which works by inserting the instruction `int3` into the code. As you might have guessed, `int3` has opcode `0xcc`. If you want to avoid that, use a hardware breakpoint instead. – Jester Jun 25 '22 at 19:09
  • @MargaretBloom Thank you for your answer. Indeed, no effective address is loaded at all, so EAX doesn't point to .text but simply holds the value of the address of .text. I think the topic can be closed. – Rémi Hoarau Jun 25 '22 at 19:59
  • @MargaretBloom: `mov eax,0x8048080` *is* pointing EAX at the first byte of `.text`, on the first iteration of the loop at `8048119`. The disassembly starts with `08048080 <.text>:`. Modern toolchains page-align `.text`, but older `ld` didn't. This looks normal for a `_start` address in 32-bit code with an older `ld`. But yes, it loops using a start and end pointer. – Peter Cordes Jun 25 '22 at 23:00
  • I doubt this was compiler-generated; note the arg-passing in EBX. The lack of function alignment and inlined `int 0x80` system calls could be done with GCC options and inline asm macros, but I don't know a way to get GCC to pass an arg in EBX. – Peter Cordes Jun 25 '22 at 23:05

1 Answer

In the comments, Margaret Bloom said:

eax doesn't point to .text, just look at the instruction at 0x80480ab. The function just loops from eax to ebx and computes s = ROL(s + ptr[i], 3) with s=0 initially. The compiler implemented the loop by first computing the difference between the end address (ebx) and the start address (eax) and then looping as many times as this difference value.
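
The loop described above can be modelled in a few lines of Python. This is only a sketch: the names `rol32` and `checksum` are made up for illustration, and the input bytes would have to be dumped from the binary. One detail the formula glosses over is that `add cl,BYTE PTR [eax]` is an 8-bit add, so any carry out of CL is dropped before `rol ecx,0x3` rotates the full 32-bit register.

```python
def rol32(value, count):
    """Rotate a 32-bit value left by count bits, like x86 `rol r32, imm8`."""
    count %= 32
    value &= 0xFFFFFFFF
    return ((value << count) | (value >> (32 - count))) & 0xFFFFFFFF


def checksum(data):
    """Model of the routine at 0x8048115 (hypothetical helper name).

    data stands for the bytes between the start pointer (EAX) and the
    end pointer (EBX); the end pointer itself is excluded.
    """
    ecx = 0                                # xor ecx,ecx
    for byte in data:                      # loop runs (ebx - eax) times
        cl = (ecx + byte) & 0xFF           # add cl,[eax]: 8-bit add, carry discarded
        ecx = (ecx & 0xFFFFFF00) | cl      # upper 24 bits of ECX are untouched
        ecx = rol32(ecx, 3)                # rol ecx,0x3
    return ecx


if __name__ == "__main__":
    # First instruction of the listing: b8 04 00 00 00 (mov eax,0x4)
    print(hex(checksum(bytes.fromhex("b804000000"))))
```

With the start and end values from the listing (0x8048080 and 0x8048123), `data` would be the 163 bytes of the code itself, which matches the iteration count reported in the comments.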

And Jester replied:

As for the 0xcc I assume you used a debugger with a software breakpoint which works by inserting the instruction int3 into the code. As you might have guessed, int3 has opcode 0xcc. If you want to avoid that, use a hardware breakpoint instead.
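
To make the effect concrete, here is a small Python sketch of what a debugger does when it sets a software breakpoint. The helper name `set_software_breakpoint` is invented for this example, and it assumes the breakpoint sits on the very first instruction at 0x8048080, which is what the reported ECX = 0xcc suggests but is not stated in the question.

```python
INT3 = 0xCC  # one-byte opcode of the x86 `int3` instruction


def set_software_breakpoint(code, offset):
    """Return a patched copy of code with INT3 at offset, plus the saved byte.

    A debugger keeps the saved byte so that it can restore the original
    instruction when the breakpoint is hit or removed.
    """
    patched = bytearray(code)
    saved = patched[offset]
    patched[offset] = INT3
    return bytes(patched), saved


if __name__ == "__main__":
    # First two instructions from the listing:
    # b8 04 00 00 00 (mov eax,0x4) and bb 01 00 00 00 (mov ebx,0x1)
    code = bytes.fromhex("b804000000bb01000000")
    patched, saved = set_software_breakpoint(code, 0)
    print(hex(code[0]), hex(patched[0]), hex(saved))
```

Here `code[0]` is 0xb8 and `patched[0]` is 0xcc, so a program that reads its own code while the breakpoint is in place sees 0xcc as the first byte. A self-checksumming routine like the one in the question would therefore compute a different value under such a debugger.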

ChrisGPT was on strike
  • As Peter noted, I was wrong. `eax` points inside the `.text` section, possibly the start of it if there's no code before the one you posted. – Margaret Bloom Jun 26 '22 at 07:51