I have dumped GCC's control flow graph in RTL format and visualized it with Graphviz. However, it is still unclear which jumps and calls are direct and which are indirect. Any suggestions for distinguishing them?
-
Which RTL are you referring to? Right-To-Left, Run-Time-Library, Register-Transfer-Level, ... – Greg Aug 20 '14 at 19:09
-
Register transfer language – user3684042 Aug 20 '14 at 19:24
-
Why don't you look at the Gimple representations? (Perhaps [MELT](http://gcc-melt.org/) could be helpful) – Basile Starynkevitch Aug 20 '14 at 22:50
-
I want the CFG to be architecture-dependent. – user3684042 Aug 21 '14 at 13:58
1 Answer
A direct call normally contains a symbol_ref in its RTL, because the compiler knows the exact call destination. An indirect call instead takes its target from a register or from a memory reference built from one or more registers, with no symbol_ref. Its address sub-expression therefore resembles the memory operands we see in an ordinary mov instruction.
For instance, the first call instruction inside the function do_early_param in Linux kernel 3.19 is a direct call:
callq ffffffffc1657839 <strcmp>
The respective RTL is
(call_insn/i:TI 19 18 134 4 (set (reg:SI 0 ax)
(call (mem:QI (symbol_ref:DI ("strcmp") [flags 0x41] <function_decl 0x2b2f6f0e1a00 strcmp>) [0 __builtin_strcmp S1 A8])
(const_int 0 [0]))) init/main.c:436 631 {*call_value}
(expr_list:REG_DEAD (reg:DI 5 di)
(expr_list:REG_DEAD (reg:DI 4 si)
(expr_list:REG_EH_REGION (const_int 0 [0])
(nil))))
(expr_list:REG_FRAME_RELATED_EXPR (use (reg:DI 5 di))
(expr_list:REG_FRAME_RELATED_EXPR (use (reg:DI 4 si))
(nil))))
Later on, we have an indirect call:
callq *0x8(%rbx)
which translates to
(call_insn:TI 38 37 140 7 (set (reg:SI 0 ax)
(call (mem:QI (mem/f:DI (plus:DI (reg/v/f:DI 3 bx [orig:60 p ] [60])
(const_int 8 [0x8])) [0 MEM[base: p_1, offset: 8B]+0 S8 A64]) [0 *D.45869_10 S1 A8])
(const_int 0 [0]))) init/main.c:439 631 {*call_value}
(expr_list:REG_DEAD (reg:DI 5 di)
(nil))
(expr_list:REG_FRAME_RELATED_EXPR (use (reg:DI 5 di))
(nil)))
(without a symbol_ref). For JMP instructions, keep in mind that an indirect JMP may in reality be a tail call, which appears in the RTL as a call_insn (an insn for which CALL_P holds) carrying the /j sibling-call flag. For instance, in the function mtrr_bp_init we have
jmpq *%rax
which in RTL is
(call_insn/j:TI 144 143 145 25 (call (mem:QI (reg/f:DI 0 ax [orig:151 mtrr_if.32_31->set_all ] [151]) [0 *D.30602_32 S1 A8])
(const_int 0 [0])) arch/x86/kernel/cpu/mtrr/main.c:744 623 {*sibcall}
(expr_list:REG_DEAD (reg/f:DI 0 ax [orig:151 mtrr_if.32_31->set_all ] [151])
(nil))
(nil))
Other than that, an indirect jmp usually appears as a PARALLEL RTL expression, typically a jump-table dispatch. For instance, inside the function update_mp_table we see
jmpq *-0x3eff7188(,%rax,8)
which in RTL is:
(jump_insn:TI 173 170 174 22 (parallel [
(set (pc)
(mem/u/c:DI (plus:DI (mult:DI (reg:DI 0 ax [orig:182 *mpt_138 ] [182])
(const_int 8 [0x8]))
(label_ref:DI 175)) [0 S8 A8]))
(use (label_ref 175))
]) arch/x86/kernel/mpparse.c:741 613 {*tablejump_1}
(expr_list:REG_DEAD (reg:DI 0 ax [orig:182 *mpt_138 ] [182])
(insn_list:REG_LABEL_OPERAND 175 (nil)))
-> 175)
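
The rules above can be turned into a rough text-based classifier for the dump. This is only a minimal sketch in Python, not part of GCC: the function name `classify_insn` and the regular expressions are assumptions made for illustration, and they rely on the dump layout shown in the examples above. It also assumes that a jump whose source is a label_ref or an if_then_else is a direct jump, and that anything else assigned to pc (reg, mem) is indirect.

```python
import re

# Insn header, e.g. "(call_insn/j:TI 144 ..." -> insn kind plus its flag letters.
HEADER = re.compile(r"\(\s*(call_insn|jump_insn)((?:/\w)*)")
# RTL code of the address operand in "(call (mem:QI (<code> ...".
CALL_TARGET = re.compile(r"\(call\s+\(mem[/\w]*:\w+\s+\((\w+)")
# RTL code of the source operand in "(set (pc) (<code> ...".
JUMP_SOURCE = re.compile(r"\(set\s+\(pc\)\s+\((\w+)")


def classify_insn(rtl: str) -> str:
    """Classify the dumped text of one RTL insn (hypothetical helper)."""
    header = HEADER.match(rtl.lstrip())
    if header is None:
        return "not a call or jump"
    kind, flags = header.groups()

    if kind == "call_insn":
        target = CALL_TARGET.search(rtl)
        if target is None:
            return "unknown call"
        # A symbol_ref target means the destination is known at compile time.
        how = "direct" if target.group(1) == "symbol_ref" else "indirect"
        # The /j flag on a call_insn marks a sibling (tail) call.
        what = "sibling call" if "/j" in flags else "call"
        return f"{how} {what}"

    source = JUMP_SOURCE.search(rtl)
    if source is None:
        return "unknown jump"  # no (set (pc) ...) found, e.g. a return
    # Assumption: label_ref and if_then_else sources are direct jumps.
    if source.group(1) in ("label_ref", "if_then_else"):
        return "direct jump"
    return "indirect jump"
```

Fed the text of each insn from the dump, the result could then be used to label or color the corresponding edges in the Graphviz output. Since it only matches text, it will miss patterns that do not follow the shapes shown above.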
