
For example, below is a piece of C code and the assembly code that the cc compiler generates for it.

// C code (pre K&R C)    
foo(a, b) {
    int c, d;
    c = a;
    d = b;
    return c+d;
}
// corresponding assembly code generated by cc
.global _foo
.text
_foo:
~~foo:
~a=4
~b=6
~c=177770
~d=177766
jsr r5, csv
sub $4, sp
mov 4(r5), -10(r5)
mov 6(r5), -12(r5)
mov -10(r5), r0
add -12(r5), r0
jbr L1
L1: jmp cret

I can understand most of the code, but I don't know what ~~foo: does, or where the magic numbers in ~c=177770 and ~d=177766 come from. The hardware is a PDP-11/40.

Thomas Dickey
  • Uninitialized variables? – fpmurphy Apr 02 '17 at 11:38
  • The lines beginning with `~` might be comments. Looking at "Lions' Commentary on UNIX 6th Edition", the assembler shown there seems to use a `/` to mark comment lines, but it has no lines starting with `~` to illustrate that usage. I am guessing, and I'm not planning to go hunting for the Unix V6 assembler for the PDP-11/40 on the web, but there's a chance that one or more of the search engines knows where such information is available. – Jonathan Leffler Apr 02 '17 at 23:12
  • I think the lines beginning with `~` are used by the linker, but I cannot find any specific documentation. –  Apr 03 '17 at 11:24
  • If working on the machine itself all of this is octal not hex. If using a modern computer with the pdp11 backend on gcc that is a different story. Look at the machine code, it is broken into 3 bit sections to make reading it in octal easier...gotta think octal to make the pdp11 easier to understand... – old_timer Apr 16 '17 at 13:09

1 Answer


The lines with tildes look like data describing the stack usage. You might find it helpful to recall that the pdp-11 used 16-bit integers, and that DEC preferred octal numbers over hexadecimal.

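The instruction

jsr r5, csv

calls the csv routine with r5 as the linkage register. After it returns, the function addresses both its arguments and its locals relative to r5, so r5 serves as the frame pointer that the offsets refer to.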

The numbers are octal offsets from r5 into the stack. The calling sequence is presumably something like this:

  • the caller pushes a and b onto the stack (positive offsets 4 and 6)
  • the call pushes the return address, and jsr r5 saves the old r5 (the two words at offsets 2 and 0)
  • csv possibly pushes other stuff, such as saved registers
  • c and d are local variables below that (negative offsets, hence the "17777x" values, which are 16-bit two's-complement negatives)

That line

~d=177776

looks odd - I'd expect

~d=177766

since it should be below c on the stack. The -10 and -12 offsets in the register operands look like they're also octal numbers. You should be able to match up the offsets with the variables, by context.

That's just an educated guess: I adapted the jsr+r5 idiom a while back in a text-editor.

The lines with tildes are symbol definitions. A clue for that is in the DECUS C Compiler Reference, found at

ftp://ftp.update.uu.se/pub/pdp11/rsx/lang/decusc/2.19/005003/CC.DOC

which says

  3.3  Global Symbols Containing Radix-50 '$' and '.' 
         ______ _______ __________ ________     ___

    With  this  version  of  Decus C, it is possible to generate and
    access global symbols which contain the Radix-50  '.'  and  '$'.
    The  compiler allows identifiers to contain the Ascii '$', which
    becomes a Radix-50 '$' in the object code.  The AS assembly code
    shows  this  character as a tilde (~).  The underscore character
    (_) in a C program  becomes  a  '.'  in  both  the  AS  assembly
    language  and  in  the  object  code.  This allows C programs to
    access all global symbols:  

            extern int $dsw;  
            .  .  .  
            printf("Directive status = %06o\n", $dsw);  

    The  above  prints  the current contents of the task's directive
    status word.

So you could read

~a=4

as

$a=4

and see that $a is a (more or less) conventional symbol.

Thomas Dickey