How do call graphs resolve function pointers?

Question

I am implementing a call graph program for C using a Perl script. How can I resolve function pointers in the call graph using the output of 'objdump'? How do different call graph applications resolve function pointers? Are function pointers resolved at run time, or can they be resolved statically?

EDIT: How do call graphs resolve cycles in static evaluation of a program?

Here's a paper about an inexpensive way to get call graph in C with function pointers... http://www.cs.rpi.edu/~milanova/docs/paper_kluw.pdf — smwikipedia, Apr 27 '21 at 02:52

score 3 · Answer 1 · answered Dec 04 '10 at 17:24

Using function pointers is a way of choosing the actual function to call at runtime, so in general, it wouldn't be possible to know what would actually happen statically.

However, you could look at all the functions that could possibly be called and show those as candidates in some way. Callbacks often (though not always) have a distinctive enough signature to narrow the set.

If you want to do better, you have to analyze the source code, to see which functions are assigned to pointers to begin with.
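As a small illustration of the two estimates above (all names here are made up for the example): matching on signature alone would list add, sub and mul as possible targets of the indirect call in apply, because all three have the type of binop. Looking at which functions are actually assigned to a pointer narrows that to add and sub.

```c
#include <assert.h>

/* Hypothetical callback type used only for this illustration. */
typedef int (*binop)(int, int);

int add(int a, int b) { return a + b; }
int sub(int a, int b) { return a - b; }
int mul(int a, int b) { return a * b; }  /* matches binop, but is never assigned to a pointer */

/* Direct call: the edge square -> mul is visible in the call itself. */
int square(int a) { return mul(a, a); }

int apply(int which, int a, int b) {
    /* Only add and sub have their address taken anywhere in this file. */
    static const binop table[2] = { add, sub };
    assert(which == 0 || which == 1);
    /* Indirect call: by signature the candidates are {add, sub, mul};
     * by assignment they are {add, sub}. */
    return table[which](a, b);
}
```

Neither estimate says which function runs on a given call; that still depends on the value of which at runtime.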

C language doesn't have much metadata, which makes such analysis almost impossible statically. — smwikipedia, Apr 27 '21 at 02:35

score 3 · Accepted Answer · answered Dec 04 '10 at 19:55

It is easy to build a call graph of A-calls-B when the call statement explicitly mentions B. It is much harder to handle indirect calls, as you've noticed.

Good static analysis tools form estimates of the contents of pointer variables by propagating pointer assignments/copies/arithmetic across program data flows (inter and intra-procedural ["global"]) using a variety of schemes, often conservative ("you get too much").
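To make the idea of propagating pointer assignments concrete, here is a minimal sketch. It is not how DMS or any particular tool works: it is a deliberately simplified, flow-insensitive and context-insensitive propagation that only handles "p = &f", "p = q", direct calls and calls through a pointer variable, and it assumes those facts have already been extracted from the source. The names Fact, Result and solve, and the use of 32-bit masks as sets, are assumptions made for the example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

enum { MAX_FUNCS = 32, MAX_VARS = 32 };  /* sets are 32-bit masks */

typedef enum { ADDR_OF, COPY, DIRECT_CALL, INDIRECT_CALL } Kind;

/* One fact found in the body of function `owner`:
 *   ADDR_OF        var[dst] = &func[src]
 *   COPY           var[dst] = var[src]
 *   DIRECT_CALL    func[src]()       (dst unused)
 *   INDIRECT_CALL  (*var[src])()     (dst unused)
 */
typedef struct { Kind kind; int owner, dst, src; } Fact;

typedef struct {
    uint32_t pts[MAX_VARS];     /* pts[v]: bit f set => v may hold &func[f]  */
    uint32_t calls[MAX_FUNCS];  /* calls[f]: bit g set => f may call g       */
    uint32_t reachable;         /* bit f set => f may run, starting at entry */
} Result;

/* Union `bits` into `*set`; return 1 if the set grew. */
static int merge(uint32_t *set, uint32_t bits) {
    uint32_t before = *set;
    *set |= bits;
    return *set != before;
}

Result solve(const Fact *facts, size_t n, int entry) {
    Result r = {0};
    int changed;
    assert(entry >= 0 && entry < MAX_FUNCS);
    r.reachable = UINT32_C(1) << entry;
    do {  /* repeat until a full pass adds nothing new */
        changed = 0;
        for (size_t i = 0; i < n; i++) {
            const Fact *f = &facts[i];
            assert(f->owner >= 0 && f->owner < MAX_FUNCS);
            /* Skip facts in functions not yet known to be reachable. */
            if (!(r.reachable & (UINT32_C(1) << f->owner))) continue;
            switch (f->kind) {
            case ADDR_OF:
                changed |= merge(&r.pts[f->dst], UINT32_C(1) << f->src);
                break;
            case COPY:
                changed |= merge(&r.pts[f->dst], r.pts[f->src]);
                break;
            case DIRECT_CALL:
                changed |= merge(&r.calls[f->owner], UINT32_C(1) << f->src);
                break;
            case INDIRECT_CALL:
                changed |= merge(&r.calls[f->owner], r.pts[f->src]);
                break;
            }
            /* New call edges make more functions reachable, which in turn
             * exposes more pointer assignments on the next pass. */
            changed |= merge(&r.reachable, r.calls[f->owner]);
        }
    } while (changed);
    return r;
}
```

Because the sets only ever grow and are bounded, the loop terminates. The result is conservative in the sense described above: an edge in calls means "may call", and an assignment inside a function that is never reached from entry contributes nothing.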

Without such an estimate, you cannot have any idea what a pointer contains and therefore simply cannot make a useful prediction (well, you can use the ultimate conservative estimate that it will go anywhere, but I think you've already rejected that solution).

Our DMS Software Reengineering Toolkit has static control/dataflow/points-to/call graph analysis that has been applied to huge systems (~25 million lines) of C code, and produced such call graphs. The machinery to do this is pretty complex but you can find it in advanced topics in the compiler literature. I doubt you want to implement this in Perl.

This is easier when you have source code, because you at least reliably know what is code, and what is not. You're trying to do this on object code, which means you can't even eliminate data.

Ira Baxter

If I use the source code to find the call graph, how does the tool handle functions in different files? For example, if function A is defined in fileA and function B in fileB, how does the call graph traverse across files? – prap19 Dec 04 '10 at 20:38
The tool reads *all* the source files and computes the dataflows within and across all files. Conceptually this isn't hard; it is rather like having all the source code in one file :-} Implementationally you might face the problem of reading 18,000 compilation units (and their header files) which means you have a big scale fight. – Ira Baxter Dec 04 '10 at 21:57
Are function pointers always resolved at run time? Is it possible to statically link the program somehow, get its disassembly output and parse it, and then obtain call graphs for these functions as well? – prap19 Dec 04 '10 at 23:36
There's some confusion here. A function pointer only gets a specific value 0x17234983 at a specific moment at runtime, just like an integer variable gets a specific value at a specific moment. A *static* analysis of "a pointer value" actually produces a *set* of possible abstract values {&foo, &bar, &baz} the pointer *might* take on at runtime. Your call graph is a "static analysis" where main { ... foo(x); ... (*p)() ... } shows that main *may* call foo directly, and main may call foo/bar/baz through p. There isn't any guarantee that *any* function in the call graph is actually called. – Ira Baxter Dec 05 '10 at 00:11
... so no, you can't do a static link and have function pointers resolved. If this isn't clear to you, you probably shouldn't be building call graphs. – Ira Baxter Dec 05 '10 at 00:17
Yes, I understand what you are trying to convey, and that is where I fumbled while implementing. There are many algorithms that try to estimate the paths, and some are overly conservative and take all paths. So I wondered how these call graph applications resolve such problems. – prap19 Dec 05 '10 at 03:45
Well, the first problem is to be able to identify all the paths. Unfortunately, this requires that you identify all possible values of function pointers. Which requires that you identify all the paths... oops. So the algorithm is a relaxation that essentially repeatedly propagates information along the paths it knows. You need a good compiler textbook on data flow algorithms and iterative solvers to get a start on understanding how all this works. And then you need to fight that scale battle. – Ira Baxter Dec 05 '10 at 03:56
oh i believe this is gonna be a massive task! – prap19 Dec 05 '10 at 13:28
