
I am currently disassembling a test program I wrote, to better understand how things work under the hood. I am using Cheat Engine, which has disassembly and debugging capabilities similar to OllyDbg.

I noticed that there are inconsistencies in the endianness used to display data. Here is a picture:

[Image: screenshot of the Cheat Engine debugger window]

If the data is displayed in a fixed-size container, such as a register, an immediate instruction operand, or a stack item, then it is displayed in big-endian format.

If the data is displayed in a continuous container, such as the memory dump window in the bottom left, or the machine-code bytes under the "bytes" column at the top left, the same data is displayed in little-endian format.

However, I am unsure whether this is a feature of the display or a feature of the ISA (for example, the data's endianness being converted when it is written to certain locations).

Why am I seeing this?

Peter Cordes
Abraham
  • It's a feature. The memory dump is just bytes, it shows you the bytes as they are in memory. Given that no size or type is associated with the dump that's all it can do anyway. For registers and stuff in instructions the disassembler knows how to decode to human readable form. – Jester Nov 14 '21 at 23:49

2 Answers

That's not intended to be "big endian"; it's just a numeric value, written in normal place-value notation with the most significant digit first, printed without spaces. Like how 3 * 5 = 15 in decimal, not 51.

The actual dump of machine-code bytes still uses the machine-code byte order (memory order), even though it groups some bytes together without spaces.

But the disassembly of course uses numeric values, just like you'd write in asm source for any assembler. There are no mainstream x86 assemblers that take numeric hex values in memory order rather than place-value order.
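To make the distinction concrete, here is a minimal Python sketch (not anything Cheat Engine itself does) that takes one 32-bit value and prints it both ways: as a number in place-value notation, and as the bytes x86 would store in memory. The helper names `as_value` and `as_dump` are made up for this illustration.

```python
import struct

# A 32-bit numeric value, written the way you would write it in asm source.
value = 0x61616100

# x86 is little-endian: the least-significant byte goes at the lowest address.
# "<I" means "little-endian unsigned 32-bit integer".
memory_bytes = struct.pack("<I", value)


def as_value(n: int) -> str:
    """Format a number in place-value notation (most significant digit first)."""
    return f"{n:08X}"


def as_dump(data: bytes) -> str:
    """Format bytes in memory order, lowest address first, space-separated."""
    return " ".join(f"{b:02X}" for b in data)


print(as_value(value))        # 61616100
print(as_dump(memory_bytes))  # 00 61 61 61
```

Both lines describe the same data. Nothing is converted when the value moves between a register and memory; only the way it is written out for a human differs.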

Peter Cordes

Generally, hex digits glued together like 61616100 should be interpreted as a value, while separated ones like 00 61 61 61 should be interpreted as a byte sequence.

But your debugger has made a poor choice in how it represents immediate operands in the machine-code bytes column. It should either have separated the bytes (at least by a thin space, if there's still a desire to group them), or shown them as a value.

The latter option is not unheard of. Agner Fog's objconv, for example, disassembles the instruction C7 04 24 00 61 61 61 (taken from the OP) as

C7. 04 24, 61616100   mov dword [esp], 1633771776

Note that the 61616100 here is a value, rather than several bytes stuck together.
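As a rough check of that reading, the following Python sketch pulls the immediate out of the same seven instruction bytes. It hard-codes the layout of this one encoding (opcode C7, one ModRM byte, one SIB byte, then a 4-byte immediate) and is not a general x86 decoder; the variable names are chosen just for this example.

```python
# Machine-code bytes of `mov dword [esp], imm32`, in memory order.
code = bytes.fromhex("C7 04 24 00 61 61 61")

# Assumed layout for this specific encoding only:
# byte 0 = opcode, byte 1 = ModRM, byte 2 = SIB, bytes 3..6 = imm32.
imm_bytes = code[3:7]

# Gluing the bytes together in memory order gives what the debugger's
# "bytes" column shows, which is easy to misread as a value.
glued = imm_bytes.hex().upper()

# Interpreting the same bytes as a little-endian integer gives the value
# that the disassembly (and objconv's listing) shows.
imm32 = int.from_bytes(imm_bytes, "little")

print(glued)            # 00616161
print(f"{imm32:08X}")   # 61616100
print(imm32)            # 1633771776
```

The last two lines match the `61616100` and `1633771776` in the objconv output above, while the first line is the byte-order grouping without spaces.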

So, in conclusion, either get used to this one place (and only this place) having this peculiarity, or file a feature request asking the debugger's authors for an option to show immediates as values instead of bytes glued together.

Ruslan
  • Note that the actual question is about why the asm-source operand is a value (which the querent is calling "big endian") rather than a byte dump, not about the grouping in the byte column (which is the only thing *you and I* found surprising here). Are you indirectly answering that by saying that the lack of spaces *should* be indicating a value, and the only problem is the lack of spaces in the "bytes" column? If so I agree. (And agree with everything you said in the answer. I just think this doesn't directly answer the question asked.) – Peter Cordes Nov 15 '21 at 16:20
  • @PeterCordes hmm, indeed I've misinterpreted the question a bit. Yes, I generally consider unseparated hex digits as a value and separated ones as bytes. I'll edit the answer to reflect this. – Ruslan Nov 15 '21 at 16:22
  • Yeah, that's 100% what I was expecting based on spacing, and even started to write my answer under the assumption that the grouped immediate in the bytes column *was* a value. (And was going to say that having a numeric *value* for addresses when the disassembly used a symbol name was in some ways a good thing, although it would have been weird to have something not in byte-order in the bytes column, so on balance the grouping is less bad). NASM and YASM listings (`nasm -l/dev/stdout`) do grouping in their bytes columns, but they never use spaces anyway and use parens or brackets. – Peter Cordes Nov 15 '21 at 16:40