I am starting to get acquainted with how code generation and optimization algorithms are implemented in gcc and llvm. Can anyone advise where to find materials, articles, or lectures about how these are organized in these compilers? I was trying to find something that describes, in fairly simple language, the implementation of the optimization and code-generation algorithms, or simply a detailed explanation, but I didn't find anything. Maybe there is an exhaustive guide where I'll be able to find information about the exact classes and methods that are called, which files these algorithms are written in, and the basic structures they operate on (symbol tables and their entries, graphs, AST, struct tree and rtl in gcc, etc.). I'm familiar with Steven Muchnick's "Advanced Compiler Design and Implementation", but without some guiding information it is quite hard to find anything in the source code of gcc and llvm that resembles the algorithms in ICAN notation.

Summary:

My goal is to get acquainted with the implementation of optimization and code-generation algorithms, using gcc and llvm as examples. So I would like to find materials that somehow simplify reading the source code of gcc or llvm. I hope that such materials exist.

  • What is your concrete goal? Why are you asking? What kind of improvements or experiments do you want to make with GCC or LLVM? You should **edit your question** to improve it (and give additional context and motivation for your work). – Basile Starynkevitch Nov 12 '17 at 09:55
  • Changed question –  Nov 12 '17 at 10:12
  • Then my answer fits even more. I've written many slides on that. But your question is still a bit off-topic here. – Basile Starynkevitch Nov 12 '17 at 10:29
  • My question was exactly about these materials, because it was hard for me to find anything helpful. –  Nov 12 '17 at 10:57
  • My answer should be helpful, even if the documentation I wrote or collected is incomplete and slightly out of date. You won't find any exhaustive and up-to-date documentation, since the source code is the authoritative material. BTW, you'll need a lot of work (several months to get a rough idea of GCC, several years to be confident and to understand that you'll never learn everything about it). – Basile Starynkevitch Nov 12 '17 at 10:59
  • Thanks! Hope this will help! –  Nov 12 '17 at 11:00

1 Answer

Your question is off-topic here (since it is about finding resources and books).

However, for GCC, I did collect several references and wrote hundreds of slides, see the documentation page of GCC MELT (and many web pages pointed from it).

For LLVM, you need to find the equivalent documentation (there is a lot of it too).

GCC MELT is now (in November 2017) an inactive project, so my slides cover older GCC versions. I could be funded to work on something similar.

> Maybe there is an exhaustive guide

You won't find anything exhaustive and up to date, because both GCC and Clang are evolving significantly and continuously. The most exhaustive material is still the source code (millions of lines, growing by a few percent each year) and the community behind it. You'll need several years of full-time work to comprehend these monster free software projects, and you should also follow their evolution.

Once you have spent several weeks reading about GCC and looking inside the source code, you can ask precise questions on gcc@gcc.gnu.org. If you experiment with a GCC plugin or work on your own fork of GCC, be sure to make it free software: before asking, publish your alpha-quality source code (even if buggy and incomplete) somewhere, perhaps on GitHub, under a GPL license.

BTW, real-life compilers are much more complex than what is taught in textbooks, even ones as good as the Dragon Book. Nobody can understand GCC (or LLVM) completely (it is too complex for a single brain and is evolving too fast), and that also holds for any multi-million-line software project.

> So I would like to find materials that somehow simplify reading of source code of gcc or llvm

Most of what I have written on GCC MELT (notably the slides that are not MELT-specific, and all the references I have collected) fits that goal. However, the authoritative material is the continuously changing source code of GCC.

NB: My gcc-melt.org domain will be lost in April 2018 (and I probably won't renew it). So look at http://starynkevitch.net/Basile/gcc-melt, which should be kept longer.

Basile Starynkevitch