Forcing RIP-relative addressing?

Question

Is there any way to force a compiler or an assembler to only generate RIP-relative addressing code?

I am trying to find a way to provide a mapping from conventional programming models to a Turing complete abstract model of computation.

This is definitely an [XY problem](https://meta.stackexchange.com/q/66377/386992). The "Y" is `-fPIC`. — Joseph Sible-Reinstate Monica, Aug 26 '20 at 00:42
@PeterCordes I am not here to address the Turing complete aspects of what I am proposing, that was merely background info for why I am seeking an answer. — polcott, Aug 26 '20 at 01:17
How would you suggest that `static char *p = &some_char_somewhere;` be pc-relative? Which pc would that static assignment be relative to? — mevets, Aug 26 '20 at 01:37
@mevets That one seems easy. It would be relative to code that accesses it. — polcott, Aug 26 '20 at 02:06
@polcott The code that addresses which one? I could `mov p(%rip), %rax`, but the value it just moved into rax is, itself, a pointer which is absolute. — mevets, Aug 26 '20 at 03:46

Peter Cordes · Accepted Answer · 2020-08-26T04:22:23.460

This seems to be related to your discussion in comments on "Is C actually Turing-complete?" on cs.SE (which I happened to see recently before you posted this).

Note that PC-relative addressing doesn't help you achieve unbounded storage. The required data size can be an unbounded amount larger than the code size, so the offset part of PC-relative addressing would need to be of unbounded size. (And it normally is only usable for static storage.)

I thought you were suggesting having pointers that were relative to their own addresses (not to code). That would still require unbounded register widths for a traditional ISA like x86-64, so you might as well just use the Random Access Machine abstract model of computation. x86-64 needs the whole absolute address in a register, or at least 2 parts that sum to the absolute address (`[base + idx*scale]`, where the scale factor is encoded as a 2-bit left-shift count). `add rdi, [rdi]` adds a pointed-to offset to a pointer (like C `ptr += *ptr`), but still needs the result to fit in a register.
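As a rough illustration of that last point, here is a minimal C sketch of a self-relative pointer. The `rel_link` type and the `rel_set` / `rel_get` helpers are hypothetical names invented for this example, and the 32-bit offset width is an assumption chosen to mirror rel32. Whatever width you pick is fixed ahead of time, and resolving the link still produces an ordinary absolute pointer.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical self-relative link: the stored value is a byte offset from
   the link's own address to its target, not an absolute address. */
typedef struct {
    int32_t off;   /* fixed width, so the reach is bounded (about +/- 2 GiB) */
} rel_link;

/* Store the distance from the link to the target. The range check marks the
   spot where an "unbounded" distance would no longer fit. */
void rel_set(rel_link *l, const void *target)
{
    intptr_t d = (intptr_t)target - (intptr_t)l;
    assert(d >= INT32_MIN && d <= INT32_MAX);
    l->off = (int32_t)d;
}

/* Resolve the link, in the spirit of `add rdi, [rdi]`: own address plus the
   stored offset. The result is a normal pointer that must fit in a register. */
void *rel_get(rel_link *l)
{
    return (char *)l + l->off;
}
```

A link and its target placed in the same object (for example two members of one struct) can be wired up with `rel_set(&box.link, &box.payload)` and read back through `rel_get(&box.link)`. The width problem has only moved from the pointer to the offset.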

If you mean stopping compilers from using absolute memory addressing for static data, then yes, that's easy: use `gcc -fPIE` or `-fPIC`.

But if you mean use only `[rip + rel32]` addressing, never `[reg]` or any subset of the general `[base + idx*scale + disp0/8/32]` alternative to `[RIP + rel32]`, then no, of course not for real-world compilers. RIP-relative addressing can only access static storage, so limiting yourself to that would mean no stack space and no pointers. x86-64's only RIP-relative addressing mode is `[rip + rel32]`, where the rel32 is a constant embedded in the machine code, not a register value.
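To make the two cases concrete, here is a small C sketch that can be compiled with `gcc -O2 -fPIE -S` to inspect the generated assembly. The file name `pie_demo.c` and the function names are made up for this example, and the instructions quoted in the comments are what GCC typically emits on x86-64 Linux, not guaranteed output. The pointer variable echoes the `static char *p = &some_char_somewhere;` example from the comments under the question.

```c
/* pie_demo.c (hypothetical file name)
   Inspect with:  gcc -O2 -fPIE -S pie_demo.c                              */
#include <assert.h>
#include <stddef.h>

char some_char_somewhere = 'x';
static int counter;                  /* static storage */
char *p = &some_char_somewhere;      /* holds an absolute address at run time;
                                        not `static` here so the optimizer
                                        cannot fold the pointer away */

/* Static data: typically accessed RIP-relative under -fPIE, for example
   `movl counter(%rip), %eax`. */
int next_id(void)
{
    return ++counter;
}

/* Loading `p` itself can be RIP-relative (`movq p(%rip), %rax`), but the
   dereference then goes through a register (`movzbl (%rax), %eax`). */
char first_char(void)
{
    assert(p != NULL);   /* under PIE, filled in by a load-time relocation */
    return *p;
}
```

The first function is the part that `-fPIE` changes. The second shows why `[reg]` addressing cannot be avoided as soon as the program uses a pointer.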

(Perhaps you could use self-modifying code to modify the rel32 of a `[RIP + rel32]` addressing mode, but no mainstream compiler will do that. It's not obvious how you'd manage re-entrancy for stack space with only one copy of a function's machine code that you're modifying, but perhaps keeping the right data in stack space would let you restore the caller's rel32 offsets.)

In hand-written asm, you can of course do whatever you want, but limiting yourself to rewriting rel32 displacements makes it (strictly?) less powerful than vanilla x86-64, not more Turing complete.

If you're looking for an addressing mode like [PC + other_register], I think 32-bit ARM has that. It has indexed addressing, and the program counter is accessible as one of the 16 general-purpose registers (unlike in AArch64). So that would let you do PC-relative indexing of static arrays. Again, not that this helps in any obvious way. For any given instruction at a fixed PC to address any of an unbounded number of memory locations, the "other register" would have to have unbounded width.

Unbounded Turing-complete C:

I believe this is impossible unless you loosen the language to remove the fact that every type (including pointers) has some fixed width decided ahead of time, not based on the size of the input you want to deal with.

A Turing-complete C implementation could call `malloc` in a loop an unbounded number of times, for example reading lines of input with `fgets` and adding each line as it arrives into a binary tree with a standard recursive approach, using a standard node layout based on C pointers: `struct node { struct node *left, *right; const char *str; };`. It would then later traverse that tree and output the lines in sorted order.

For a tree to work, any existing node needs to be able to point to a newly-allocated node. Section-relative addressing doesn't get you any closer to that, as far as I can see. This binary-tree example might be a good litmus test for unbounded C, involving pointers to other objects with an arrangement depending on the input.
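A minimal sketch of that litmus test, for reference. The helper names `tree_insert`, `tree_read` and `tree_print` are invented for this example, and the fixed 4096-byte line buffer is a simplification (longer lines would be split into pieces). On any real implementation `struct node *` has a fixed width, so only a bounded number of nodes is addressable, which is exactly the limitation under discussion.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

struct node { struct node *left, *right; const char *str; };

/* Insert a copy of `line` into the tree rooted at *root, using the standard
   recursive approach. Returns 0 on success, -1 if malloc fails. */
int tree_insert(struct node **root, const char *line)
{
    assert(root != NULL);
    if (*root == NULL) {
        struct node *n = malloc(sizeof *n);
        char *copy = malloc(strlen(line) + 1);
        if (n == NULL || copy == NULL) {
            free(n);
            free(copy);
            return -1;
        }
        strcpy(copy, line);
        n->left = n->right = NULL;
        n->str = copy;
        *root = n;   /* an existing link now points at new storage */
        return 0;
    }
    if (strcmp(line, (*root)->str) < 0)
        return tree_insert(&(*root)->left, line);
    return tree_insert(&(*root)->right, line);
}

/* Read lines with fgets until EOF, inserting each one as it arrives. */
int tree_read(struct node **root, FILE *in)
{
    char buf[4096];
    while (fgets(buf, sizeof buf, in) != NULL)
        if (tree_insert(root, buf) != 0)
            return -1;
    return 0;
}

/* In-order traversal writes the lines in sorted (strcmp) order. */
void tree_print(const struct node *t, FILE *out)
{
    if (t == NULL)
        return;
    tree_print(t->left, out);
    fputs(t->str, out);
    tree_print(t->right, out);
}
```

A driver would start with `struct node *root = NULL;`, call `tree_read(&root, stdin)` and then `tree_print(root, stdout)`. Where each new node ends up linked depends entirely on the input, so the access pattern cannot be reduced to walking contiguous storage.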

What you describe in comments seems to be writing local parts of a UTM state machine in x86 asm, with each state having its own 2GiB memory space and the ability to jump forward or backward to the next state. There's no clear way to have true random access or real pointers across states; those work only within the code for one state.

Using a finite C implementation for each step of a UTM doesn't give you an overall Turing-complete C implementation. It gives you a Turing machine with tape-like, non-random access once your problem sizes exceed what you can do inside one "state" or "memory bank" or whatever you call it.
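The shape of that model can be sketched in C as a chain of fixed-size banks. Everything here is a hypothetical stand-in: `BANK_SIZE` plays the role of one 2GiB block, and the `left` / `right` links belong to the simulating host, not to the code running inside a bank. The point it tries to show is that a step can address only the cells of the current bank, and reaching bank k takes k sequential moves.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

enum { BANK_SIZE = 8 };   /* tiny stand-in for one 2GiB block */

struct bank {
    struct bank *left, *right;       /* host-side links: the "tape" */
    unsigned char cell[BANK_SIZE];   /* all that one step can address */
};

/* Allocate a zeroed, unlinked bank; returns NULL on allocation failure. */
struct bank *bank_new(void)
{
    struct bank *n = malloc(sizeof *n);
    if (n != NULL) {
        n->left = n->right = NULL;
        memset(n->cell, 0, sizeof n->cell);
    }
    return n;
}

/* Move the head one bank to the right, growing the tape on demand. */
struct bank *bank_right(struct bank *b)
{
    assert(b != NULL);
    if (b->right == NULL) {
        struct bank *n = bank_new();
        if (n == NULL)
            return NULL;
        n->left = b;
        b->right = n;
    }
    return b->right;
}

/* Reach the bank k positions to the right: k sequential moves, with no way
   to jump there directly. Returns NULL if an allocation fails. */
struct bank *bank_seek(struct bank *start, size_t k)
{
    struct bank *b = start;
    while (b != NULL && k-- > 0)
        b = bank_right(b);
    return b;
}
```

Nothing stored in a `cell` array can name a cell in another bank, so a structure like the binary tree above cannot span banks without re-implementing pointers on top of sequential moves.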

Point being of course, that `rip` is still a 64 bit register. — Joshua, Aug 26 '20 at 01:27
@Joshua: Yes, or in a hypothetical extension of x86-64 where it has some arbitrary or even unbounded size, its size grows with the program code, not the data. It is a von Neumann architecture so it's not a hard distinction. I guess perhaps jumping around to create new code, and static space near that code, could maybe lead to something, with unbounded RIP but finite other registers? Especially in a machine like ARM that allows PC + reg. But implementing proper pointers still seems impossible; if they have to be finite width but able to address any element of unbounded space. — Peter Cordes, Aug 26 '20 at 01:39
I went down this rabbit hole in college. If you can actually build a Turing machine you have enough power to build first level hypercomputer on top of it as a matter of raw physics. There are enough APIs in certain C environments to build this nonsense on the abstract machine but the real hardware will run out of address lines. — Joshua, Aug 26 '20 at 01:44
@PeterCordes I am trying to define a mapping from RIP addressing to a Turing Complete abstraction that obtains its unlimited memory access by 32-bit signed offsets to instruction pointer of unspecified width. It could use absolute addressing and every other aspect of the x86 language within each 2GB memory block. — polcott, Aug 26 '20 at 01:53
@polcott: In other words, *not* a random access machine. Unbounded storage is only possible by moving sequentially between different 2GiB code+data blocks. So you've just invented a way to make the states of a traditional tape-based Turing machine more complicated. Sounds like you'd want ARM for PC+register addressing modes, though, because x86-64 addressing modes other than RIP+rel32 are fully absolute, not absolute-within-a-section. (Unless you redefine what they mean, in which case it's something like memory bank switching where you can't access any of the old memory after jumping.) — Peter Cordes, Aug 26 '20 at 02:11
@PeterCordes I am trying to define the simplest mapping from an architecture that has a "C" compiler to a Turing complete abstract model of computation. — polcott, Aug 26 '20 at 02:20
@polcott: I thought so, but this doesn't seem to be a step in the right direction. Changing the meaning of memory references to section-absolute fundamentally breaks the way C compilers use pointers, and the entire concept of a C pointer as being able to random-access any data object in the C abstract machine. I don't think there's a way to get what you want by making small tweaks to C compiler output from existing mainstream compilers. — Peter Cordes, Aug 26 '20 at 02:31
@PeterCordes Yes. That was the deeper question that I was asking. Nonetheless it does still seem feasible. I am writing a UTM that has x86 code as its Turing Machine description language. — polcott, Aug 26 '20 at 02:39
@polcott: ok, that's fine, like I said you can have x86 or ARM machine code run for each step of the state-machine and then "move the tape" (memory banks). So you could use compiler output to implement code that works on a bounded amount of data within a step. But that doesn't help you create a Turing-complete *C implementation*, only using C in specific / limited ways for parts of a Turing machine. If that was your true overall goal, I don't see it helping with that. But if not, that's fine. — Peter Cordes, Aug 26 '20 at 02:43
@polcott: A Turing-complete C implementation could e.g. call `malloc` in a loop an unbounded number of times, e.g. reading lines of input with `fgets`, and adding each line as it arrives into a binary tree. Then later traverse that tree and output the lines in order. For a tree to work, any existing node needs to be able to point to a newly-allocated node. Section-relative addressing doesn't get you any closer to that, as far as I can see. That binary-tree example might be a good litmus test for unbounded C, involving pointers to other objects with an arrangement depending on the input. — Peter Cordes, Aug 26 '20 at 04:10
@PeterCordes My example was to count to infinity and store the result as space delimited finite strings of ASCII digits. — polcott, Aug 26 '20 at 04:23
@polcott: ah, then yeah my tree example has requirements that contiguous storage across multiple 2GiB chunks won't have. (Eventually a single number will get that long, but the layout is still "simple" without any need for true random access. Hope that helps you grok why pointers are a problem you can't easily side-step with PC-relative or pointer-relative addressing.) — Peter Cordes, Aug 26 '20 at 04:29
Let us [continue this discussion in chat](https://chat.stackoverflow.com/rooms/220470/discussion-between-polcott-and-peter-cordes). — polcott, Aug 26 '20 at 04:39
