
I have dumped GCC's control flow graph in RTL format and visualized it with Graphviz. However, it is still unclear which jumps and calls are direct and which are indirect. Any suggestions for telling them apart?

Greg
user3684042

1 Answer


A direct call normally contains a symbol_ref as part of the RTL, because the compiler knows the call destination exactly. For an indirect call, the target is a memory reference whose address is built from one or more registers, with no symbol. The address sub-expression therefore resembles the ordinary memory operand you would see in a mov instruction.
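As a rough illustration of this rule, here is a minimal Python sketch that inspects the text of one call_insn from a dump and looks at the first operator inside the call's mem operand. The helper name classify_call and the regular expression are my own assumptions, not part of GCC, and the sketch only covers address shapes that start with symbol_ref, reg, mem or plus; anything else is reported as unknown rather than guessed.

```python
import re

# Hypothetical helper, not a GCC API: matches "(call (mem:QI (<operator>"
# and captures the operator that starts the call address.
CALL_TARGET = re.compile(r"\(call\s+\(mem[/\w]*:\w+\s+\((\w+)")

def classify_call(rtl: str) -> str:
    """Classify the text of one call_insn as direct, indirect or unknown."""
    m = CALL_TARGET.search(rtl)
    if m is None:
        return "unknown"          # no (call (mem ...)) found in this text
    head = m.group(1)
    if head == "symbol_ref":
        return "direct"           # destination is a known symbol
    if head in ("reg", "mem", "plus"):
        return "indirect"         # address comes from registers or memory
    return "unknown"              # other shapes are not handled here

if __name__ == "__main__":
    print(classify_call('(call (mem:QI (symbol_ref:DI ("strcmp")) [0 S1 A8])'))
    print(classify_call('(call (mem:QI (reg/f:DI 0 ax) [0 S1 A8])'))
```

Matching on dump text is fragile; it is meant for post-processing a dump you already have, not as a replacement for inspecting the RTL inside the compiler.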

For instance, the first call instruction inside the function do_early_param in Linux kernel 3.19 is a direct call:

callq  ffffffffc1657839 <strcmp>

The respective RTL is

(call_insn/i:TI 19 18 134 4 (set (reg:SI 0 ax)                          
    (call (mem:QI (symbol_ref:DI ("strcmp") [flags 0x41] <function_decl 0x2b2f6f0e1a00 strcmp>) [0 __builtin_strcmp S1 A8])
        (const_int 0 [0]))) init/main.c:436 631 {*call_value}               
 (expr_list:REG_DEAD (reg:DI 5 di)                                          
    (expr_list:REG_DEAD (reg:DI 4 si)                                       
        (expr_list:REG_EH_REGION (const_int 0 [0])                          
            (nil))))                                                        
(expr_list:REG_FRAME_RELATED_EXPR (use (reg:DI 5 di))                       
    (expr_list:REG_FRAME_RELATED_EXPR (use (reg:DI 4 si))                   
        (nil))))

Later on, we have an indirect call:

 callq  *0x8(%rbx)

which translates to

  (call_insn:TI 38 37 140 7 (set (reg:SI 0 ax)                            
        (call (mem:QI (mem/f:DI (plus:DI (reg/v/f:DI 3 bx [orig:60 p ] [60])    
                        (const_int 8 [0x8])) [0 MEM[base: p_1, offset: 8B]+0 S8 A64]) [0 *D.45869_10 S1 A8]) 
            (const_int 0 [0]))) init/main.c:439 631 {*call_value}               
     (expr_list:REG_DEAD (reg:DI 5 di)                                          
        (nil))                                                                  
    (expr_list:REG_FRAME_RELATED_EXPR (use (reg:DI 5 di))                       
        (nil)))

Note the absence of a symbol_ref. For jmp instructions, keep in mind that an indirect jmp may in reality be a tail call, which appears as a call_insn (CALL_P) in the RTL; in the dump below it carries the /j flag and matches the *sibcall pattern. For instance, in the function mtrr_bp_init we have

 jmpq   *%rax

which in RTL is

 (call_insn/j:TI 144 143 145 25 (call (mem:QI (reg/f:DI 0 ax [orig:151 mtrr_if.32_31->set_all ] [151]) [0 *D.30602_32 S1 A8])
        (const_int 0 [0])) arch/x86/kernel/cpu/mtrr/main.c:744 623 {*sibcall}   
     (expr_list:REG_DEAD (reg/f:DI 0 ax [orig:151 mtrr_if.32_31->set_all ] [151])
        (nil))                                                                  
    (nil))

Otherwise, an indirect jmp usually appears as a jump_insn whose body is a PARALLEL RTL expression, as in the jump-table dispatch below. For instance, inside the function update_mp_table we see

jmpq   *-0x3eff7188(,%rax,8)

which in RTL is:

    (jump_insn:TI 173 170 174 22 (parallel [                              
            (set (pc)                                                           
                (mem/u/c:DI (plus:DI (mult:DI (reg:DI 0 ax [orig:182 *mpt_138 ] [182])
                            (const_int 8 [0x8]))                                
                        (label_ref:DI 175)) [0 S8 A8]))                         
            (use (label_ref 175))                                               
        ]) arch/x86/kernel/mpparse.c:741 613 {*tablejump_1}                     
     (expr_list:REG_DEAD (reg:DI 0 ax [orig:182 *mpt_138 ] [182])               
        (insn_list:REG_LABEL_OPERAND 175 (nil)))                                
 -> 175)
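
The jump cases above can be told apart in the same text-based way. The following is only a sketch under my own assumptions: classify_jump is a hypothetical helper, it treats a call_insn carrying the /j flag as a sibling (tail) call, and for a jump_insn it looks at the first operator after (set (pc). I assume that label_ref and if_then_else there mean a direct jump and that reg or mem means an indirect one; other forms are reported as unknown.

```python
import re

# Hypothetical helper, not a GCC API.
INSN_HEAD = re.compile(r"\s*\((call_insn|jump_insn)((?:/\w)*)")
PC_SOURCE = re.compile(r"\(set\s+\(pc\)\s*\((\w+)")

def classify_jump(rtl: str) -> str:
    """Classify the text of one insn as sibling call, direct, indirect or unknown."""
    head = INSN_HEAD.match(rtl)
    if head is None:
        return "unknown"
    kind, flags = head.group(1), head.group(2)
    if kind == "call_insn":
        # A jmp emitted for a tail call is dumped as call_insn/j.
        return "sibling call" if "/j" in flags else "unknown"
    src = PC_SOURCE.search(rtl)
    if src is None:
        return "unknown"          # e.g. a return, which has no (set (pc) ...)
    if src.group(1) in ("label_ref", "if_then_else"):
        return "direct"           # target label is known at compile time
    if src.group(1) in ("reg", "mem"):
        return "indirect"         # target is read from a register or a table
    return "unknown"

if __name__ == "__main__":
    print(classify_jump("(jump_insn 10 9 11 2 (set (pc) (label_ref 23)))"))
    print(classify_jump("(call_insn/j:TI 144 143 145 25 (call (mem:QI (reg/f:DI 0 ax))))"))
```

A sibling call reported here can still be direct or indirect; its call operand would have to be checked separately for a symbol_ref.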
nettrino