How does "self-modified link" work in Pegasus programming?

Question

I've been looking at a simulation of the 1950s 'Pegasus' computer, and have come across the term "self-modified link". How does this work?

Random guess based on the name: early computers without modern "call" or "branch and link" instructions used to pass a return address by storing an address before the target function's machine code. (And no this isn't re-entrant.) Possibly a self-modified link could be something to do with that, like writing a return address into a jump instruction? If link has the same meaning as it does in "branch and link" in modern RISC ISAs. — Peter Cordes, Feb 13 '20 at 12:13
According to https://archive.org/stream/bitsavers_ferrantipemingMan1962_40324310/PegasusProgrammingMan_1962_djvu.txt, it does seem to be something to do with jumps and subroutines. But the diagrams are garbled so I can't see what their example is. — Peter Cordes, Feb 13 '20 at 12:14
This question might be better suited to https://retrocomputing.stackexchange.com/ since it's about an ancient computer. — Peter Cordes, Feb 13 '20 at 12:16
Looking at the link posted by Peter Cordes, there are no call or return instructions, just jump instructions, and instead there are programming conventions for storing "return" addresses in memory or in register (XI). Also due to limited memory space, code is kept in blocks on external storage, and loaded as needed to be run. Unlike conventional subroutines, these may have multiple hard coded branches back into multiple locations of what could be considered the "calling" code. — rcgldr, Feb 13 '20 at 14:43
I don't know about Pegasus, but for the CDC 3000 series and some other computers, an address could be stored into the address field of an instruction. The call instruction stored the return address in the first word of a subroutine, which used a jump indirect to return. A register based alternative could be used, but there was no standard convention for this. IBM mainframes have an "execute" instruction to provide a register to be used as a third operand for an otherwise two variable and one fixed operand instruction. — rcgldr, Feb 13 '20 at 14:46

Erik Eidt · Accepted Answer · 2020-02-20T01:41:28.280

Let's first note that:

- Blocks of code are loaded into sequential registers for execution.
- Each register holds 39 bits: either one data word or two instructions.

The Pegasus computer supported subroutine/function calls by having the caller pass two instructions (packed in one register) that tell the callee how to return. They are passed as data, i.e. as a parameter. The callee then takes that parameter, stores it into its own local memory, and transfers control to it, thus executing the two instructions that the caller passed as data.

The caller would do this:

    load instructions from L1: into X1 register
    ... pass other parameters ...
    load callee's first block into U0
    transfer control to U0 at start
L1: load instructions from main block back into U0
    transfer control to U0 at desired caller resume point (e.g. L2)
L2: ...caller resumes...

And the callee would do this:

    ... compute something ...
    store X1 into L4              # writes the instruction pair held in X1 into L4
    transfer control to L4        # needed only if L4 does not immediately follow
L4: # starts empty; by now it holds the two instructions copied from L1,
    # so control is transferred back to the caller

The sequence at L1 is never directly executed, but rather is a template for instructions to be passed as a parameter to the callee, who will store that template into its own local memory and then execute that copy.
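To make the cue/link flow concrete, here is a minimal sketch in Python of a toy machine that mimics the two listings above. Everything in it is an assumption made for illustration: the opcode names (`load_x1`, `store_x1`, `jump`, `log`, `nop`, `halt`), the word numbers, and the idea of tracing with `log` are invented and are not real Pegasus orders. The only properties it tries to reproduce are that each word holds a pair of instructions and that the callee plants the caller's pair into its own code before running it.

```python
# Toy model only: the opcode names and memory layout are invented for
# illustration and are not real Pegasus orders.

def run(memory, start):
    """Run the toy machine; every memory word holds a pair of instructions."""
    x1 = None                      # stand-in for accumulator X1
    trace = []                     # (word, message) for each "log" executed
    word, half = start, 0
    while True:
        op, arg = memory[word][half]
        # By default fall through to the next half-word, then the next word.
        nxt = (word, 1) if half == 0 else (word + 1, 0)
        if op == "load_x1":        # copy a whole word (two instructions) into X1
            x1 = memory[arg]
        elif op == "store_x1":     # overwrite a word with the pair held in X1
            memory[arg] = x1
        elif op == "jump":
            nxt = (arg, 0)
        elif op == "log":
            trace.append((word, arg))
        elif op == "halt":
            return trace
        word, half = nxt           # "nop" needs no action of its own

L1, L2, CALLEE, L4 = 1, 2, 10, 11
memory = {
    0:      (("load_x1", L1), ("jump", CALLEE)),               # caller: the cue
    L1:     (("log", "caller block restored"), ("jump", L2)),  # link template
    L2:     (("log", "caller resumed"), ("halt", None)),
    CALLEE: (("log", "callee working"), ("store_x1", L4)),     # plant the link
    L4:     (("nop", None), ("nop", None)),                    # empty until planted
}

trace = run(memory, 0)
print(trace)
# [(10, 'callee working'), (11, 'caller block restored'), (2, 'caller resumed')]
```

The middle entry of the trace is recorded at word 11 (the callee's L4), not at word 1 (the caller's L1): the template is only ever executed as a copy inside the callee. In this toy the first instruction of the pair merely logs a message where the real caller would reload its own block.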

They use the term "cue" for call (subroutine invocation, i.e. transfer of control to the callee), "link" for return (transfer of control from the subroutine back to the proper caller), and "orders" for machine code instructions.

So, when they speak of a "self-modified link", they mean that the mechanism by which a subroutine returns to its caller uses self-modifying code: code that alters its own instruction(s) while executing, i.e. code that writes into instruction memory for imminent execution. Here it is the write of X1 (a copy of the instruction pair at L1) to L4, which gets executed just after being written.

Self-modifying code is generally frowned upon these days because it has several drawbacks: the instruction cache must be kept in sync with the modification, which hurts performance, and the instruction memory must be writable, which weakens security.

Self-modification was often used on older processors to work around what we might today think of as missing instructions in the instruction set. For example, the PDP-8 had I/O instructions, but the I/O port was hard-coded into the instruction; there was no indirect I/O port access. Thus, for a subroutine to access an I/O port taken as a parameter, self-modifying code was used to construct an I/O instruction for that port. (A large switch statement could also have been used, but would take quite a bit more code.) These older processors had no instruction caches, and neither programmers nor development tools enforced separation of instruction sections from data sections (i.e. they didn't attempt to protect instructions).
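As a rough illustration of the PDP-8 case, the sketch below assembles an I/O (IOT) instruction word from a device code supplied at run time. It assumes the usual 12-bit IOT layout of opcode 6 in the top three bits, a 6-bit device code, and a 3-bit function code. The helper name `make_iot` is hypothetical, and the device and function values in the usage line are the conventional teleprinter ones as I recall them, so treat them as illustrative. On the real machine the subroutine would store the resulting word into its own instruction stream and then execute it; here we only build the word.

```python
IOT_OPCODE = 0o6000  # opcode 6 in the top three bits of a 12-bit word

def make_iot(device, function):
    """Assemble an IOT instruction word for a device chosen at run time."""
    if not 0 <= device < 64:
        raise ValueError("device code must fit in 6 bits")
    if not 0 <= function < 8:
        raise ValueError("function code must fit in 3 bits")
    return IOT_OPCODE | (device << 3) | function

# Device 04 with function bits 110 (assumed teleprinter output codes).
word = make_iot(0o04, 0o6)
print(oct(word))  # 0o6046
```

The shift and OR operations here are the bitfield manipulation that the subroutine has to perform on every call when the device is only known at run time.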

In the case of the Pegasus, there are no indirect branches, which rules out returning to a caller-provided address. As it turns out, the caller's code may also have been overlaid in the meantime. Passing two instructions for the callee to execute covers both needs: one reloads the block of registers holding the caller's code, and the other jumps to the resume point within that block.

(Passing fully formed instructions as a parameter also took fewer instructions for return linkage than the other self-modifying alternative, dynamically constructing a jump instruction from an address parameter, which would have needed several more instructions for the bitfield manipulation.)

https://en.wikipedia.org/wiki/Ferranti_Pegasus
http://bitsavers.org/pdf/ferranti/pegasus/PegasusProgrammingMan_1962.pdf

Yes, I see that's the way to call a subroutine, but what's this 'self-modified'? — John White, Feb 14 '20 at 08:18
@JohnWhite: I assume because the callee stores some instructions into the path of execution, i.e. self-modifying code. And the purpose of the code is to enable call/return, so self-modifying link seems to fit well enough. — Peter Cordes, Feb 14 '20 at 10:53
I agree with @PeterCordes' assumption, and also, they use the term "cue" for invocation, and "link" for return from subroutine. — Erik Eidt, Feb 14 '20 at 14:57

score 0 · Answer 2 · edited Apr 11 '20 at 05:59

From page 119 of George Felton's Pegasus Programming Manual.
This looks like the real answer.

answered Apr 11 '20 at 05:50 by John White · edited Apr 11 '20 at 05:59 by Peter Cordes
