Not sure if this is the right community for the question, but bear with me...
On an old Zilog Z80 CPU, it is possible to jump to whatever byte address you want in memory. That means it is also possible to jump right into the middle of an instruction.
Consider the machine code 21 00 C9 (LD HL, $C900), which sets the HL register to 0xC900. If you were to jump to the byte just after the 21 (say, by doing JR -2 right after the above instruction), the instruction sequence becomes 00 C9: NOP followed by RET; a completely different thing.
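To make the example concrete, here is a minimal Python sketch that decodes the same three bytes from two different starting offsets. The table only covers the three opcodes mentioned above, and the names OPCODES and decode_from are made up for illustration; this is not how any particular disassembler is implemented.

```python
# Toy decoder covering only the three opcodes used in this example.
# OPCODES and decode_from are hypothetical names, not from any real tool.
OPCODES = {
    0x00: ("NOP", 1),
    0x21: ("LD HL, ${:04X}", 3),  # 16-bit immediate, stored little-endian
    0xC9: ("RET", 1),
}


def decode_from(code: bytes, start: int) -> list[str]:
    """Decode instructions back to back from `start` until the bytes run out."""
    out = []
    pc = start
    while pc < len(code):
        fmt, length = OPCODES[code[pc]]
        if length == 3:
            # Low byte comes first, then high byte.
            imm = code[pc + 1] | (code[pc + 2] << 8)
            out.append(fmt.format(imm))
        else:
            out.append(fmt)
        pc += length
    return out


code = bytes([0x21, 0x00, 0xC9])
print(decode_from(code, 0))  # ['LD HL, $C900']
print(decode_from(code, 1))  # ['NOP', 'RET']
```

The bytes are identical in both calls; only the entry point differs, and that alone decides which instructions come out.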
Furthermore, a region of garbage memory or non-executable data that gets interpreted as code can 'desynchronize' the code coming after it. If, say, the last byte of a data region is 21, as above, then the next two bytes (which are really the start of code) might be interpreted as the immediate value for the LD HL, xxxx instruction, which in turn can completely change how the block of code is disassembled, because the first two bytes change meaning.
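Here is a small Python sketch of that desynchronization, assuming a naive disassembler that simply decodes one instruction after another from wherever it is told to start. The opcode table is a tiny hand-picked subset, the memory contents are invented for the example, and TABLE and linear_sweep are hypothetical names.

```python
# Hypothetical toy table: opcode -> (mnemonic template, total length in bytes).
TABLE = {
    0x00: ("NOP", 1),
    0x21: ("LD HL, ${:04X}", 3),
    0x3E: ("LD A, ${:02X}", 2),
    0xC9: ("RET", 1),
}


def linear_sweep(mem: bytes, start: int) -> list[tuple[int, str]]:
    """Decode back to back from `start`, as a naive disassembler would."""
    listing = []
    pc = start
    while pc < len(mem):
        template, length = TABLE[mem[pc]]
        # Operand bytes (if any) follow the opcode, little-endian.
        operand = int.from_bytes(mem[pc + 1:pc + length], "little")
        # str.format ignores the unused argument for operand-less mnemonics.
        listing.append((pc, template.format(operand)))
        pc += length
    return listing


# Offset 0 is the last byte of a data region (it happens to be 0x21).
# The real code starts at offset 1: LD A, $05 / NOP / RET.
mem = bytes([0x21, 0x3E, 0x05, 0x00, 0xC9])

print(linear_sweep(mem, 1))  # [(1, 'LD A, $05'), (3, 'NOP'), (4, 'RET')]
print(linear_sweep(mem, 0))  # [(0, 'LD HL, $053E'), (3, 'NOP'), (4, 'RET')]
```

In this made-up case the stray 21 swallows the LD A instruction as its immediate value, and the two listings only agree again from offset 3 onward.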
So, my question is: How does a disassembler determine where instruction boundaries are, with these corner cases in mind?