I've learned that reordering instructions can save clock cycles and avoid data hazards.

However, I'm finding it difficult to understand exactly how we can reorder these instructions. The best way I've found so far is by putting them in a table and writing out their specific steps and comparing them, but this takes a lot of time.

Question: Are there any shortcuts or tips/tricks that can help spot immediate improvements in the code without having to put them in a table?

Ryan Russell
  • Think about your instructions in terms of which separate dependency chain they're part of, and interleave dep chains to allow in-order pipelines to find instruction-level parallelism. Do you have a specific micro-architecture in mind? 68000 itself isn't pipelined, is it? – Peter Cordes Jun 07 '20 at 17:28
  • If you're using anything before a 68020, there's no instruction pipeline and I'm not aware of any stalling possible due to previous instructions still being in flight. – Thomas Jager Jun 07 '20 at 17:56
  • BTW, my suggestion to think about dependency chains works for true RAW dependencies. Avoiding WAR and WAW anti-dependency hazards tends to be less of a problem on a 2-operand ISA like 68k, although `mov` to the same scratch reg could couple dependencies together on a pipeline without full OoO exec and register renaming. I guess a superscalar in-order pipeline might need to use different scratch regs for different dep chains that you interleave. – Peter Cordes Jun 08 '20 at 03:46
  • You may often save some cycles by avoiding unconditional branches in the most likely path. Also try to arrange your code blocks in a way that most branches are short (less than 128 bytes offset), and especially for smaller subroutines you can consider inlining them (this might work against the "have short branches" optimization). I'll vote this question as "too broad", though. – chtz Jun 08 '20 at 16:02
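
The dependency-chain suggestion in the comments could look roughly like the sketch below. It is only an illustration: the registers and the addresses in `a0`/`a1` are hypothetical, and it assumes a pipelined 68k-family core where a result consumed by the very next instruction can stall (the comments above note that the plain 68000 is not pipelined, so there it would make no difference).

```asm
; Two independent chains: (a0) -> d0 -> add, and (a1) -> d2 -> add.
; Hypothetical registers; a0 and a1 are assumed to hold valid addresses.

; Before: each ADD uses the register loaded on the line just above it,
; so the two RAW dependencies are back to back.
        move.l  (a0),d0         ; chain A: load
        add.l   d1,d0           ; chain A: needs d0 immediately
        move.l  (a1),d2         ; chain B: load
        add.l   d3,d2           ; chain B: needs d2 immediately

; After: the chains are interleaved, so an instruction from the other
; chain sits between each load and the ADD that consumes it.
        move.l  (a0),d0         ; chain A: load
        move.l  (a1),d2         ; chain B: load (independent of d0)
        add.l   d1,d0           ; chain A: d0 was loaded two lines ago
        add.l   d3,d2           ; chain B: d2 was loaded two lines ago
```

As a quick check without drawing a table: for each instruction, look at which registers it reads and see whether the line directly above writes one of them. If it does, try to move an instruction from an unrelated chain in between. Both versions leave the same values in `d0` and `d2`, and the final condition codes match because the last instruction is the same; the intermediate condition codes differ, so this reordering is only safe when nothing in between tests them.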

1 Answer

You probably knew this one, but instead of

JSR foo    ; push the return address and run foo
JMP bar    ; foo's RTS comes back here, then we jump to bar

you can do

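PEA bar    ; push bar's address where a return address would go
JMP foo    ; foo's RTS pops that address and lands directly in bar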

(assuming both functions end in RTS, of course.)

puppydrum64