0

PostgreSQL has these monitoring tables, in particular pg_stat_statements, that contians this column called queryId and I was wondering how it is computed. If anybody knows where can I find the source code, that would be greatly appreciated.

So the pg_stat_statements tables would show something like:

userid | dbid | queryid |      query       | other statistics related columns 
1      |  2   | 123     | SELECT * FROM a; | ...

I am interested on how this 123 is being calculated.

Miguel Wang
  • 183
  • 1
  • 12

1 Answers1

0

This query ID is computed by the pg_stat_statements extension, see contrib/pg_stat_statements/pg_stat_statements.c

This comment explains how it works:

As of Postgres 9.2, this module normalizes query entries.  Normalization
is a process whereby similar queries, typically differing only in their
constants (though the exact rules are somewhat more subtle than that) are
recognized as equivalent, and are tracked as a single entry.  This is
particularly useful for non-prepared queries.

Normalization is implemented by fingerprinting queries, selectively
serializing those fields of each query tree's nodes that are judged to be
essential to the query.  This is referred to as a query jumble.  This is
distinct from a regular serialization in that various extraneous
information is ignored as irrelevant or not essential to the query, such
as the collations of Vars and, most notably, the values of constants.

This jumble is acquired at the end of parse analysis of each query, and
a 32-bit hash of it is stored into the query's Query.queryId field.
The server then copies this value around, making it available in plan
tree(s) generated from the query.  The executor can then use this value
to blame query costs on the proper queryId.
Laurenz Albe
  • 209,280
  • 17
  • 206
  • 263
  • Thanks for the answer Laurenz. I was wondering if you know where I can find the code to see this fingerprinting in action. Because I want to do something similar and I wanted to see if I can make use of the implemented version for Postgres – Miguel Wang Oct 04 '20 at 07:54
  • I pointed you to the file and function. You have to read the code for the exact algorithm. – Laurenz Albe Oct 05 '20 at 05:06