PostgreSQL makes use of intra-operation parallelism, which is of interest to me for my undergraduate final-year research project. I would like to know how operations like selection, projection, join, etc. are parallelized, but when I tried to look at the source code, I was overwhelmed. Is there a high-level PostgreSQL "map"?

I tried looking for books that discuss and explore the algorithms and implementations used in PostgreSQL, but unfortunately didn't find any. Though feel free to refer me to such a book if you know about one.

If the only option I have is to dig into the source code, how long would it take me to find the information I want? And if any of you have gone through the source code, what advice would you give me?

1 Answer

The nice thing about open source is that there is no clear border between the source code and the documentation, since both are public. As soon as you get deeper into the implementation details, you will start reading the code. Fortunately the PostgreSQL code is well written and quite readable.

The first stop on your way into the source should be the README files. These describe implementation principles, algorithms and coding rules at a higher level. In your case, you should start with src/backend/access/transam/README.parallel.

Another good approach is to read the patches that introduced the feature, like 924bcf4f16d, 7aea8e4f2daa, d1b7c1ffe72, f0661c4e8c44 and 80558c1f5aa1. Doing so shows you the places in the code that are concerned with parallel query and gives you an idea of how it all works.
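If you want to skim those commits quickly, something like the following might help. It is only a rough sketch: it assumes you have `git` installed and a local clone of the PostgreSQL repository, and the helper names `git_show_command` and `summarize_commits` are made up for illustration. It asks git for each commit's subject line and the list of files it touched, which is usually enough to see which parts of the tree deal with parallel query.

```python
import subprocess

# Abbreviated commit hashes, exactly as quoted in the answer text.
PARALLEL_QUERY_COMMITS = [
    "924bcf4f16d",
    "7aea8e4f2daa",
    "d1b7c1ffe72",
    "f0661c4e8c44",
    "80558c1f5aa1",
]


def git_show_command(commit, repo_dir):
    """Build the `git show` command for one commit (hypothetical helper).

    --stat lists the files changed; --format prints the short hash
    and the subject line instead of the full commit header.
    """
    return ["git", "-C", str(repo_dir), "show", "--stat",
            "--format=%h %s", commit]


def summarize_commits(repo_dir, commits=PARALLEL_QUERY_COMMITS):
    """Run git for each commit and return {hash: summary text}.

    Assumes repo_dir is a local clone that contains these commits;
    raises subprocess.CalledProcessError if git reports an error.
    """
    summaries = {}
    for commit in commits:
        result = subprocess.run(
            git_show_command(commit, repo_dir),
            capture_output=True,
            text=True,
            check=True,
        )
        summaries[commit] = result.stdout
    return summaries
```

For example, `summarize_commits("postgresql")` would return the summaries if your clone lives in a directory called `postgresql` (that path is just a placeholder). From there, `git show <hash>` on an individual commit gives you the full diff and the commit message.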

Laurenz Albe