6

I would like to know how to use GCC as a library to parse C/C++/Java/Objective C/Ada code for my program. I want to bypass preprocessing and prefix all user-written functions with `My`, so that `Print();` becomes `MyPrint();`. I also want to do the same with the variables.

undur_gongor
zeitue
  • 5
    That's a fairly large requirement. What have you tried? What specific problems are you facing? Have you read the man page on the gcc chain (for without preprocessing and whatnot)? Also, preprocessing will probably not be the only step required to get code to work inside of your program. What exactly is it that you're trying to do? – Corbin Nov 15 '11 at 23:52
  • I have tried using clang, but I can never get it to compile. I'm not too good with parsing. What's the gcc chain? I want to skip the preprocessing because I don't want the headers or macros in the files. Basically, what I am trying to do is rename functions and variables with a prefix. – zeitue Nov 16 '11 at 00:00
  • Ah, I meant to say gcc tool chain. And really that was a broader term than I should have used. I should have just said the gcc man page, and maybe the link man page depending on what exactly you're doing. – Corbin Nov 16 '11 at 00:04
  • Mooing Duck are you saying you can use Lua for c++ parsing? I don't know Lua but I'd learn it if it would work. Also I have read the GCC man pages – zeitue Nov 16 '11 at 00:08
  • 1
    Usually when someone wants to include GCC, they are just trying to make add-ons for their project, which is _much_ easier done with a scripting language. If your goal is to alter source code, then you'll want to interface with GCC. – Mooing Duck Nov 16 '11 at 00:12
  • Have you tried [ctags](http://ctags.sourceforge.net/)? It can be used to locate the functions and variables down to the character, and then you can probably modify their names. – Tamás Szelei Nov 16 '11 at 00:45
  • Ctags does not seem to do what I want. – zeitue Nov 16 '11 at 02:28
  • related http://programmers.stackexchange.com/questions/189949/is-there-a-way-to-use-gcc-as-a-library/323821#323821 – Ciro Santilli OurBigBook.com Jul 02 '16 at 09:40

6 Answers

4

You can look here:
http://codesynthesis.com/~boris/blog/2010/05/03/parsing-cxx-with-gcc-plugin-part-1/

This describes how to use the GCC plugin interface to parse C++ code. Other languages should be handled in the same manner.

You can also try Pork from Mozilla:
https://wiki.mozilla.org/Pork

When I tried Pork, I spent an hour or so fixing compile problems, but after that I could write scripts like this:

rewrite SyncPrimitiveUpgrade {
  type PRLock* => Mutex*
  call PR_NewLock() => new Mutex()
  call PR_Lock(lock) => lock->Lock()
  call PR_Unlock(lock) => lock->Unlock()
  call PR_DestroyLock(lock) => delete lock
}

This finds every use of the type PRLock and replaces it with Mutex. It also finds calls to functions such as PR_NewLock and replaces them with "new Mutex".

ildjarn
fghj
2

You might wish to investigate the sparse C parser. It understands a lot of C (all the C used in the Linux kernel sources, which is a fairly good subset of legal ANSI-C and GNU-C extensions) and provides a few sample compiler backends to provide a lint-like static analysis tool for type checking.

While the code looks very clean and thorough, your task might be easier done via another mechanism -- the example.c included with the sparse source that demonstrates a compiler is 1955 lines long.

sarnold
2

For C, you cannot do that reliably. If you skip preprocessing you will -- in general -- not have valid C code to be parsed. E.g.

#define FOO
#define BAR
#define BAZ

FOO void BAR qux BAZ(void) { }

How is the parser supposed to recognize this as a function definition of qux without doing the preprocessing?
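To make the point concrete, here is a toy Python sketch, not how any real parser works. It assumes a naive heuristic that takes the identifier directly before the parameter list as the function name. The helper names `find_function_names` and `strip_empty_macros` are made up for this illustration, and the stand-in for preprocessing only handles `#define NAME` lines with an empty body.

```python
import re

SOURCE = """\
#define FOO
#define BAR
#define BAZ

FOO void BAR qux BAZ(void) { }
"""

# Naive heuristic (an assumption for this sketch): the identifier right
# before "(...) {" is the name of the function being defined.
DEFINITION = re.compile(r"\b([A-Za-z_]\w*)\s*\([^()]*\)\s*\{")


def find_function_names(code):
    """Return the names the naive heuristic takes for function definitions."""
    return DEFINITION.findall(code)


def strip_empty_macros(code):
    """Toy stand-in for preprocessing: only handles '#define NAME' with an empty body."""
    macros = re.findall(
        r"^[ \t]*#[ \t]*define[ \t]+([A-Za-z_]\w*)[ \t]*$", code, flags=re.MULTILINE
    )
    # Drop the directive lines, then erase every use of the empty macros.
    kept = [line for line in code.splitlines() if not line.lstrip().startswith("#")]
    text = "\n".join(kept)
    for name in macros:
        text = re.sub(r"\b%s\b" % re.escape(name), "", text)
    return text


print(find_function_names(SOURCE))                      # ['BAZ']
print(find_function_names(strip_empty_macros(SOURCE)))  # ['qux']
```

Without expansion the heuristic reports BAZ, which is a macro and not a function at all; only after the macros are removed does qux become visible as the function name.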

undur_gongor
  • 1
    What I meant by wanting to skip preprocessing is that I don't want the macros to be expanded and I don't want the headers in the final code. – zeitue Nov 16 '11 at 17:57
2

First, GCC is not a library, and is not structured to be one (in contrast to LLVM).

Why (i.e. what for) do you want to parse C, C++, Ada source code?

I would consider (assuming a GCC 4.6 version) extending GCC either through plugins written in C or, preferably, using MELT, a high-level domain-specific language for extending GCC (disclaimer: I am the main author of MELT).

But using GCC as a library is not realistic at all.

I really think that for what you want to achieve, MELT is the right tool. However, it is poorly documented. Please use the gcc-melt@googlegroups.com list to ask questions.

And be aware that extending GCC does take some amount of work (more than a week perhaps), because you need to partly understand the GCC internal representations.

Basile Starynkevitch
2

Forget about GCC; it's made as a compiler's parser, not an analysis parser. You'd do much better using something like libclang, a C interface to clang, which can process both C and C++.

Necrolis
  • Several GCC plugins are able to do some analysis, so I do think it may be a relevant tool. – Basile Starynkevitch Nov 16 '11 at 09:18
  • Basile: the problem is that they are plugins, whereas clang is built from the ground up as an analyzer and parser. I suppose a plugin would reduce toolchain length, but to be honest, I'd recommend using LLVM + clang over GCC these days. – Necrolis Nov 16 '11 at 10:15
  • Being a **compiler plugin** can be viewed not as a problem, but as a **decisive advantage**: the plugin works exactly on the compiler internals, so is exactly seeing what the compiler sees and what the compiler works upon. Also, using a compiler plugin is less disruptive than some external tool: you basically just add a sequence of arguments to the CFLAGS or CXXFLAGS of your makefile. – Basile Starynkevitch Nov 16 '11 at 10:19
  • I have tried using libclang, but I had too many problems. I went through tutorials and other things and could never get any of the code to compile: http://stackoverflow.com/questions/7921676/documentation-for-clangs-libraries So I decided to try three ways (gcc, clang and ANTLR); one should work. – zeitue Nov 16 '11 at 18:03
2

Our DMS Software Reengineering Toolkit can parse C, C++, Java and Ada code (not Objective C at this time) in a wide variety of dialects and carry out transformations on the code. DMS's C and C++ front ends include a preprocessor, so you can run preprocessing before you parse.

I probably don't understand what you want to do, because it seems strange to rename every function and (global?) variable with a "My..." prefix. But you could do that with some DMS rules (a rough sketch of renaming user functions for GCC3):

domain C~GCC3.

rule rewrite_function_names(t: type_designator, i: IDENTIFIER, p: parameter_list, s: statements):
      function_header->functionheader
"\t \i(\p) { \s } " -> "\t \renamed\(\i\) (\p) { \s }" ;

and a helper function "renamed" that takes a tree node containing an identifier, and returns a tree node with the renamed identifier.

Because DMS patterns only match against the parse trees, you won't get any false positives.

You'd need some additional patterns to handle the various syntax cases within each language (e.g., for C, the "void" return type, because "void" isn't a type designator in the syntax, and global variable declarations), and different rules for different languages (Ada's syntax is not the same as that of C).

This might seem like a big hammer for your task, but if you really insist on doing this for a variety of languages in a reliable way, it seems hard to avoid the problem of getting decent parsers for all those languages. (And if you are really going to do this for all these languages, DMS can be taught to handle Objective C the same way we have taught it to handle the other languages.)

Your alternative is some kind of string hacking solution, which might work 95% of the time. If you can live with that, then Perl or something similar is likely your answer.
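As a rough idea of what such string hacking could look like, here is a minimal Python sketch (Python in place of Perl; the approach is the same). Everything in it is an assumption for illustration: the function `add_prefix`, the hand-written `USER_NAMES` set, and the sample input. It skips string literals, character literals and comments, and prefixes any remaining identifier found in the given set.

```python
import re

# Hypothetical input: the names considered "user written" must be listed by
# hand, because a regex cannot tell user code from library code.
USER_NAMES = {"Print", "counter"}

# One alternation: literals and comments are matched first so they can be
# copied through unchanged; anything else that matches is an identifier.
TOKEN = re.compile(
    r'"(?:\\.|[^"\\])*"'                # string literal
    r"|'(?:\\.|[^'\\])*'"               # character literal
    r"|//[^\n]*"                        # line comment
    r"|/\*.*?\*/"                       # block comment
    r"|(?P<ident>\b[A-Za-z_]\w*)",      # identifier
    re.DOTALL,
)


def add_prefix(code, names, prefix="My"):
    """Prefix every identifier in `names`, leaving literals and comments alone."""
    def replace(match):
        ident = match.group("ident")
        if ident is not None and ident in names:
            return prefix + ident
        return match.group(0)
    return TOKEN.sub(replace, code)


example = 'int counter; /* Print stays here */ void Print(void) { puts("Print"); counter++; }'
print(add_prefix(example, USER_NAMES))
# int Mycounter; /* Print stays here */ void MyPrint(void) { puts("Print"); Mycounter++; }
```

The remaining failures are the cases a regex cannot see: names produced by macros, a local variable that shadows a library name, struct members that share a name with a function, and so on.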

Ira Baxter
  • Looks like a good solution, but it appears to be proprietary software. I don't have money right now to buy software, and I also don't use Windows; other than that, I would have used this. – zeitue Nov 16 '11 at 18:22