Copying the value at the top of the stack into HL, while leaving the stack unchanged, can be accomplished by:

          pop hl                                         ; 10 cycles
          push hl                                        ; 11 cycles

in 21 cycles. The only alternative I have found is `ex (sp),hl`, which takes 19 cycles. Its downside is that it swaps HL with the top of the stack, so the two must be exchanged back once I'm done with the value. In practice, this makes it even costlier than the first method.
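
To make that cost concrete, here is a rough sketch of the exchange-and-restore pattern. The timings are the standard Z80 T-state counts; the placeholder comment in the middle stands for whatever code uses the value and is purely an assumption.

```asm
          ex (sp),hl          ; 19t: HL = top of stack, top of stack = old HL
          ; ... use HL here (hypothetical work) ...
          ex (sp),hl          ; 19t: stack restored, old HL back in HL
                              ; 19+19 = 38t, versus 21t for pop hl : push hl
```

Note that after the second exchange HL no longer holds the stack value, so this only fits code that consumes the value between the two exchanges.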

Are there any other alternatives?

Peter Cordes
  • I don't believe you will find a faster mechanism on the Z80 for copying the top of stack to HL and be able to retain the stack in its current state without more instructions. – Michael Petch Mar 16 '18 at 20:20
  • Ok. I can live with that. Thanks for confirming my suspicions. –  Mar 16 '18 at 20:26
  • On some variants of the Z80 like the game boy there is `LD HL, (SP+e)` – Michael Petch Mar 16 '18 at 20:30
  • What did put the value into the stack? Can't it write it directly to some memory? (into `ld hl,NNNN 10t` self-modifying it) ... may be worth it especially if you read that value more times than you write it. – Ped7g Mar 16 '18 at 21:12
  • or if you know the `sp` will be always the same value at that point of execution, you can eventually write once at init that value into `ld hl,(NN)` instruction, which is 20t and 4B long (i.e. -1t). – Ped7g Mar 16 '18 at 21:25
  • Actually `ld hl,(NN)` has an optimized version on the Z80. While it can be ED6Bxxyy, it is also 2Axxyy which is 16t, 3B. – Zeda Mar 17 '18 at 12:47

1 Answer

If you would like to get the value on the top of the stack into HL, then you already listed pretty much every available option:

          pop hl : push hl                               ; 10+11 = 21t

or

          pop hl : dec sp : dec sp                       ; 10+6+6 = 22t

You can also use self-modifying code, although it won't be of much benefit here:

          ld (addr1+1),sp
addr1:    ld hl,(0)                                      ; 20 + 16 = 36t

Even more awkwardly, you can get SP into HL first:

          ld hl,0 : add hl,sp
          ld a,(hl) : inc hl : ld h,(hl) : ld l,a        ; 10+11 + 7+6+7+4 = 21+24 = 45t

(I mention the last two options in case you can somehow benefit from one of them in your situation.)
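
One situation where the SP-in-HL variant might pay off, offered only as a sketch, is reading an entry deeper in the stack: the offset loaded into HL can be any constant, and the cost stays at 45t regardless of depth. The offset 4 (the third 16-bit entry) is an arbitrary choice for illustration.

```asm
          ld hl,4 : add hl,sp                            ; HL = SP+4, address of the third entry
          ld a,(hl) : inc hl : ld h,(hl) : ld l,a        ; HL = word at SP+4, still 45t in total
                                                         ; clobbers A and flags, stack untouched
```

For comparison, reaching the same entry with three pops and three pushes would take 3*10 + 3*11 = 63t and clobber two more register pairs.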

The command

          ex (sp),hl                                     ; 19t

exchanges the value at the top of the stack with the current value of HL, so it is not exactly what you asked for (even though it is fast). So, the only way to speed this up further would be to cheat in various ways. These are the ways to cheat that seem most obvious:

  • If you actually know where SP is pointing, simply reading that address directly would be quicker:

         ld hl,(wherethestackis)                         ; 16t
    
  • If you can control exactly where the stack is, or what it contains, you can make the top of the stack coincide with the operand of the instruction that loads the value into HL, so that you can simply do

         ld hl,thevalueonthestack                        ; 10t
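
As a rough sketch of how the second trick might look, here are two possible readings. Both assume the code runs from RAM, that the value to be stored is in DE, and that the labels `load_hl` and `load_hl2` are hypothetical names; none of this comes from the question itself.

```asm
; Variant A: the producer patches the operand instead of pushing the value.
          ld (load_hl+1),de   ; 20t: store DE into the two operand bytes
          ; ... other code ...
load_hl:  ld hl,0             ; 10t: the 0 is a placeholder, overwritten above

; Variant B: the stack pointer is aimed so that a push lands on the operand.
          ld sp,load_hl2+3    ; SP points just past the two operand bytes
          push de             ; 11t: writes D to load_hl2+2 and E to load_hl2+1
          ; ... no further pushes, calls or interrupts before the load ...
load_hl2: ld hl,0             ; 10t: operand was filled in by the push
```

In variant B any further push, call or interrupt would overwrite the opcode itself, so it only makes sense with interrupts disabled and SP redirected again afterwards.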
    

I understand that both of these options seem extreme, but I know a lot of highly optimized Z80 code that benefits from similar tricks. So, please do not dismiss them straight away.

introspec
  • I appreciate your thoroughness and somewhat unorthodox ideas. Unfortunately, I can't know what `sp` will be ahead of time, and additionally, I have to preserve the stack contents, so `ex (sp),hl` is less efficient than `pop hl : push hl`. I agree with the person who tagged my question "micro-optimization." It is a relatively small gripe with the hardware, I admit, and I think I'll have to settle for my original approach. Thanks much anyways. –  Mar 20 '18 at 02:46