I am working on my dissertation and need to parse and tokenize source code into individual functions. For every function I would like to extract the names of types, the names of called functions, and type casts. Is Clang the right tool for that kind of job? If so, how can I do this?

Below is a simple C function. The items I want to extract are shown in bold:

static char func1(unsigned int a, struct foo *b)
{
    int c = 0;
    struct bar *d;

    if (a == 0) {
        d = func2((int) a);
    } else {
        c = func3((struct bar *) b);
    }

    return c;
}
Vladimir Panteleev
Andreas Geo

2 Answers

Yes, Clang is the right tool to do this job.

You should take a look at libclang.

There is plenty of information on the internet, but I can personally recommend two great articles:

Parsing C++ in Python with Clang by Eli Bendersky

Introduction to libclang by Mike Ash

If you prefer video, I recommend the libclang presentation from the 2010 LLVM Developers' Meeting, titled "libclang: Thinking Beyond the Compiler".

AlexDenisov
  • Thank you for your suggestions. After doing some reading of my own I found Clang's documentation. It says that you can use Clang for different stages of compilation. If I am correct, I am interested in the preprocessing stage and the parsing and semantic analysis stage. Those stages can be invoked using the -fsyntax-only flag. However, with this flag Clang does not return the AST file as described in the documentation. Any ideas? Am I missing some tools, for example from LLVM, and is that why it does not generate the AST? – Andreas Geo Jun 05 '16 at 18:18
  • I can't say for sure what the purpose of `-fsyntax-only` is, but I assume it's there for easier debugging and maybe for some static analysis, when we don't need to proceed after semantic analysis is done. `libclang` covers everything you need: it basically does everything up to code generation, including preprocessing (#includes, C macros), lexing, parsing, and semantic analysis. – AlexDenisov Jun 05 '16 at 18:47
  • I got an idea from reading the articles you posted. If I understand correctly, Clang is a compiler, but its individual components are also provided as libraries, so I could build my own preprocessing tool and configure it as I like? Since that is out of the scope of my project, is there something that is ready to use? – Andreas Geo Jun 05 '16 at 19:48

I would recommend LibTooling. If you only need preprocessing, PreprocessOnlyAction is helpful. If you need information from the abstract syntax tree, ASTFrontendAction is the better choice.

Layne Liu