Is it possible to realize subroutine without indirect addressing?

Question

I am working on the Simple-Compiler project in Deitel's book C how to program. Its main goal is to generate a compiler for an advanced language called SIMPLE and the relevant machine language is called SIMPLETRON.

I've completed some basic features for this compiler but am now stuck with an enhanced requirement -- to realize gosub and return (subroutine features) for SIMPLE language.

The main obstacle here is that SIMPLETRON doesn't support indirect addressing, which means the strategy to use stack for returning addresses of subroutines can't work. In this case, is it possible to somehow make subroutines work?

PS: I searched this issue and found an relevant question here. It seemed self-modifying code might be the answer, but I failed to find specific resolutions and thus I still raised this question. Moreover in my opinion machine instructions for SIMPLETRON has to be extended to make self-modifying code work here, right?

Background information for SIMPLETRON machine language:

It includes only one accumulator as register.

All supported machine instructions as below:

Input/output operations

#define READ 10: Read a word from the terminal into memory and with an operand as the memory address.

#define WRITE 11: Write a word from memory to the terminal and with an operand as the memory address.

Load/store operations

#define LOAD 20: Load a word from memory into the accumulator and with an operand as the memory address.

#define STORE 21: Store a word from the accumulator into memory and with an operand as the memory address.

Arithmetic operations

#define ADD 30: Add a word from memory to the word in the accumulator (leave result in accumulator) and with an operand as the memory address.

#define SUBTRACT 31: Subtract a word ...

#define DIVIDE 32: Divide a word ...

#define MULTIPLY 33: Multiply a word ...

Transfer of control operations

#define BRANCH 40: Branch and with an operand as the code location.

#define BRANCHNEG 41: Branch if the accumulator is negative and with an operand as the code location.

#define BRANCHZERO 42: Branch if the accumulator is zero and with an operand as the code location.

#define HALT 43: End the program. No operand.

Because I don't think current instructions support `STORE $ + 1` — Orpheus, Jan 08 '17 at 17:11
`STORE` can only be used with a constant operand as a memory address, not a variable. — Orpheus, Jan 08 '17 at 17:14
It's not a variable. `$` would be assembler syntax for "the address we're storing the instruction currently being assembled in". `$ + 1` is resolved by the assembler and written into machine code as a constant. — melpomene, Jan 08 '17 at 17:16
Seems like an exact duplicate of the linked question, right? — vgru, Jan 08 '17 at 17:22
Oh I see. So this means storing the value in accumulator to memory space for next instruction, which somehow simulates indirect addressing. — Orpheus, Jan 08 '17 at 17:30
When machines were actually as simple as simpltron, they used self-modifying code to implement indirect addressing: code that wrote memory locations that were later run as code. For a stack, you'd modify the addresses in load and store instructions. For indirect branching, you'd modify the address in a jump instruction. If this is the actual machine description: http://faculty.cs.niu.edu/~karonis/Classes/01spring/580M/simpletron/node2.html, then this won't be hard. — Gene, Jan 08 '17 at 17:31
@Gene Thanks for the explanation of self-modifying code. Very helpful. — Orpheus, Jan 08 '17 at 17:44
For ROM code, the indirect memory operation can be done by setting up in RAM as part of program initialisation, the few instructions needed to make a subroutine, into which you "poke" the required address before calling it. — Weather Vane, Jan 08 '17 at 17:59

score 3 · Accepted Answer · answered Jan 08 '17 at 17:26

I'm not familiar with SIMPLE or SIMPLETRON, but in general I can think of at least 3 approaches.

Self-modifying code

Have a BRANCH 0 instruction at the end of each subroutine, and before that, code to load the return address into the accumulator and STORE it into the code itself, thus effectively forming a BRANCH <dynamic> instruction.
Static list of potential callers

If SIMPLE doesn't have indirect calls (i.e. every gosub targets a statically known subroutine), then the compiler knows the list of possible callers of each subroutine. Then it could have each call pass a unique argument (e.g. in the accumulator), which the subroutine can test (pseudocode):
```
SUBROUTINE:
...
if (arg == 0)
    branch CALLER_1;
if (arg == 1)
    branch CALLER_2;
if (arg == 2)
    branch CALLER_3;
```
Inlining

If SIMPLE doesn't allow recursive subroutines, there's no need to implement calls at the machine code level at all. Simply inline every subroutine into its caller completely.

Approach 1 is the exact answer to my question. Thanks! – Orpheus Jan 09 '17 at 04:15 — Orpheus, Jan 09 '17 at 04:15

score 2 · Answer 2 · answered Jan 08 '17 at 18:19

Yes, you can do this, even reasonably, without self-modifying code.

You turn your return addresses into a giant case statement. The secret is understanding that a "return address" is just a way to get back to point of the call, and that memory is just a giant array of named locations.

Imagine I have a program with many logical call locations, with the instruction after the call labelled:

     CALL    S
 $1: ...
     ...
     CALL    T
 $2: ...
     ...
     CALL    U
 $3: ...

We need to replace the CALLs with something our machine can implement. Let's also assume temporarily that only one subroutine call is active at any moment.

Then all that matters, is that after a subroutine completes, that control returns to the point after the call.

You can cause this by writing the following SIMPLETRON code (I'm making up the syntax). By convention I assume I have a bunch of memory locations K1, K2, ... that contain the constants 1, 2, .. etc for as many constants as a I need.

 K1:  1
 K2:  2
 K3:  3
    ...
    LOAD   K1
    JMP    S
  $1: ...
     ...
    LOAD   K2
    JMP    T
  $2: ...
     ...
    LOAD   K3
    JMP    U
  $3:....

  S:  STORE RETURNID
     ...
     JMP  RETURN

  T:  STORE RETURNID
     ...
     JMP  RETURN

   U: STORE RETURNID
     ...
     JMP   RETURN

  RETURN:  LOAD RETURNID
     SUB    K1
     JE     $1
     LOAD   RETURNID
     SUB    K2
     JE     $2
     LOAD   RETURNID
     SUB    K3
     JE     $3
     JMP    *     ; bad return address, just hang

In essence, each call site records a constant (RETURNID) unique to that call site, and "RETURN" logic uses that unique ID to figure out the return point. If you have a lot of subroutines, the return logic code might be quite long, but hey, this is a toy machine and we aren't that interested in efficiency. You could always make the return logic into a binary decision tree; then the code might be long but it would only take log2(callcount) to decide how to get back, not actually all that bad).

Let's relax our assumption of only one subroutine active at any moment. You can define for each subroutine a RETURNID, but still use the same RETURN code. With this idea, any subroutine can call any other subroutine. Obviously these routines are not-reentrant, so they can't be called more than once in any call chain.

We can use this same idea to implement a return stack. The trick is to recognize that a stack is merely a set of memory locations with an address decoder that picks out members of the stack. So, lets implement PUSH and POP instructions as subroutines. We change our calling convention to make the caller record the RETURNID, leaving the accumulator free to pass a value:

        LOAD   K1
        STORE  PUSHRETURNID
        LOAD   valuetopush
        JMP    PUSH
     $1:
        LOAD   K2
        STORE  POPRETURNID
        JMP    POP
     $2:...

     TEMP:
     STACKINDEX: 0   ; incremented to 1 on first use
     STACK1:  0      ; 1st stack location
     ...
     STACKN:  0

     PUSH:  STORE TEMP   ; save value to push
         LOAD PUSHRETURNID ; do this here once instead of in every exit
         STORE RETURNID
         LOAD STACKINDEX   ; add 1 to SP here, once, instead of in every exit
         ADD  K1
         STORE STACKINDEX
         SUB  K1
         JE   STORETEMPSTACK1
         LOAD STACKINDEX
         SUB  K2
         JE   STORETEMPSTACK2
         ...
         LOAD STACKINDEX
          SUB  Kn
         JE   STORETEMPSTACKn
         JMP   *           ; stack overflow

   STORETEMPSTACK1:
         LOAD   TEMP
         STORE  STACK1
         JMP    RETURN

   STORETEMPSTACK2:
         LOAD   TEMP
         STORE  STACK2
         JMP    RETURN

         ...

       POP:  LOAD   STACKINDEX
         SUB    K1        ; decrement SP here once, rather than in every exit
         STORE  STACKINDEX
         LOAD   STACKINDEX
         SUB    K0
         JE     LOADSTACK1   
         LOAD   STACKINDEX
         SUB    K1
         JE     LOADSTACK2
         ...
    LOADSTACKn:
         LOAD   STACKn
         JMP    POPRETURN

    LOADSTACK1:
         LOAD   STACK1
         JMP    RETURNFROMPOP

    LOADSTACK2:
         LOAD   STACK2
         JMP    RETURNFROMPOP

    RETURNFROMPOP: STORE TEMP
         LOAD   POPRETURNID
         SUB    K1
         JE     RETURNFROMPOP1
         LOAD   POPRETURNID
         SUB    K2
         JE     RETURNFROMPOP2
         ...

    RETURNFROMPOP1:  LOAD TEMP
         JMP    $1

    RETURNFROMPOP2:  LOAD TEMP
         JMP    $2

Note that we need RETURN, to handle returns with no value, and RETURNFROMPOP, that handles returns from the POP subroutine with a value.

So these look pretty clumsy, but we can now realize a pushdown stack of fixed but arbitrarily large depth. If we again make binary decision trees out the stack location and returnID checking, the runtime costs are only logarithmic in the size of the stacks/call count, which is actually pretty good.

OK, now we have general PUSH and POP subroutines. Now we can make calls that store the return address on the stack:

        LOAD  K1   ; indicate return point
        STORE PUSHRETURNID
        LOAD  K2   ; call stack return point
        JMP   PUSH
     $1: LOAD argument  ; a value to pass to the subroutine
         JMP  RECURSIVESUBROUTINEX
        ; returns here with subroutine result in accumulator
     $2:

   RECURSIVESUBROUTINEX:
        ...compute on accumulator...
        LOAD  K3   ; indicate return point
        STORE PUSHRETURNID
        LOAD  K4   ; call stack return point
        JMP   PUSH
     $3: LOAD  ...  ; some revised argument
         JMP  RECURSIVESUBROUTINEX
     $4: ; return here with accumulator containing result
         STORE  RECURSIVESUBROUTINERESULT
         LOAD K5
         STORE POPRETURNID
         JMP   POP
     $5: ; accumulator contains return ID
         STORE  POPRETURNID
         LOAD  RECURSIVESUBROUTINERESULT
         JMP   RETURNFROMPOP

That's it. Now you have fully recursive subroutine calls with a stack, with no (well, faked) indirection.

I wouldn't want to program this machine manually because building the RETURN routines would be a royal headache to code and keep right. But a compiler would be perfectly happy to manufacture all this stuff.

Thanks for all the explanation. But I think self-modifying code is a much easier way than this. Thus I marked that as the answer. — Orpheus, Jan 09 '17 at 04:15
If you work with a real microprocessor that has a very limited instruction set, you will pretty much find that can't write self-modifying code, because the code is in a ROM. Obviously you are playing with a completely fake machine so you can do what you want, (I wrote this answer for the pure intellectual pleasure) but your experience with the completely fake machine won't be very realistic. In your shoes, I'd spend my time working on another toy computer instruction set that was more realistic than this one. (If you did like my answer, you can still upvote it without accepting it) — Ira Baxter, Jan 09 '17 at 04:19
I did like your answer and tried to upvote it. But my current reputation is less than 15, so it won't change the post score. Sorry about that, especially considering your efforts on writing such a detailed&helpful resolution.. — Orpheus, Jan 09 '17 at 04:37

rcgldr · Answer 3 · 2017-01-09T04:54:38.173

Although there's no way to get the current instruction's location from within the SIMPLE instruction set, the assembler can keep track of instruction locations in order to generate the equivalent of return instructions.

The assembler would generate a branch to address instruction in the program image to be used as a return instruction, then to implement a call it would generate code to load a "return instruction" and store it at the end of a subroutine before branching to that subroutine. Each instance of a "call" would require an instance of a "return instruction" in the program image. You may want to reserve a range of variable memory to store these "return instructions".

Example "code" using a call that includes the label of the return instruction as a parameter:

        call    sub1, sub1r
        ; ...
sub1:   ; ...
sub1r:  b       0              ;this is the return instruction

Another option would be something similar to MASM PROC and ENDP, where the ENDP would hold the return instruction. The call directive would assume that the endp direction holds the branch to be modified and the label would be the same as the corresponding proc directive.

        call    sub1
        ; ...

sub1    proc                   ;subroutine entry point
        ; ...
sub1    endp                   ;subroutine end point, "return" stored here

The issue here is that the accumulator would be destroyed by the "call" (but not affected by the "return"). If needed, subroutine parameters could be stored as variables, perhaps using assembler directives for labeling:

sub1    parm1                  ;parameter 1 for sub1
        ;....
        load    sub1.parm1     ;load sub1 parameter 1

Thanks for the answer. This seemed to be a brief summary of the above one. — Orpheus, Jan 09 '17 at 04:15
@OrpheusQu - I was thinking of having an assembler directive for call, rather than having the user code around this limitation. The assembler call directive could use a label parameter for the "return" instruction to be modified, and the assembler would maintain a set of these return instructions in an assembler generated literal (variable) pool. — rcgldr, Jan 09 '17 at 04:28
Got you! I misunderstood your idea and now this is another different direction to solve this problem~ — Orpheus, Jan 09 '17 at 04:42
@OrpheusQu - I updated my answer. It doesn't deal with reentrant or recursive functions though. — rcgldr, Jan 09 '17 at 04:56

Is it possible to realize subroutine without indirect addressing?

3 Answers3