0

I have Citus extension on a PostgresSQL server. And I want to see the statistics from pg_stat_statements of each worker through the coordinator node. However, there is no column to match the tables from coordinator and workers. Does anybody know how can I do that?

I am also interested on how the queryId is being computed by PostgreSQL.

So the pg_stat_statements tables on the coordinator would show something like:

userid | dbid | queryid |      query       | other statistics related columns 
1      |  2   | 123     | SELECT * FROM a; | ...

While the pg_stat_statements tables on the worker would show something like:

userid | dbid | queryid |          query          | other statistics related columns 
1      |  2   | 456     | SELECT * FROM a_shard1; | ...
1      |  2   | 789     | SELECT * FROM a_shard2; | ...
Miguel Wang
  • 183
  • 1
  • 12

2 Answers2

0

You can match the table names on workers (shards) to the distributed tables on the coordinator with the help of pg_dist_partition, and pg_dist_shard_placement tables. For matching the stats, you can check citus_stat_statements view.

Ahmet Eren Başak
  • 493
  • 1
  • 4
  • 14
  • I do not have access to the citus_stat_statements view, so I would like to aggregate the information from coordinators and workers by myself. Do you know of any way that could be done? – Miguel Wang Oct 04 '20 at 07:58
0

(Cannot reply above answer so adding my answer here)
You can use below query to list location of shards of a specific table in a specific worker node (See last three filters in WHERE clause).

SELECT pg_dist_shard.shardid, pg_dist_node.nodename, pg_dist_node.nodeport
FROM pg_dist_shard, pg_dist_placement, pg_dist_node
WHERE pg_dist_placement.groupid = pg_dist_node.groupid AND
      logicalrelid = '<distributedTableName>'::regclass AND
      pg_dist_node.nodename = '<nodeName>' AND
      pg_dist_node.nodeport = '<nodePort>';

Then you can execute below query in worker node of your interest to see what Citus executes for a specific shard in that worker node:

SELECT * FROM pg_stat_statements WHERE query LIKE '%_<shardId>%';
onurctirtir
  • 200
  • 5
  • So this does not really solve the problem. First of all, this would be sort of equivalent to search each record on pg_stat_statements one by one. Second, even if after some work, the search on pg_stat_statements can be done through all shardIds together, this would only allow matching queries that use the same table. But if we have more complex queries that makes use of several tables there would be a cartesian-product number of possibilities and still cannot ensure 1-to-1 mapping. What I would like is to link wueries, not tables. Still, thanks a lot for the reply. – Miguel Wang Oct 06 '20 at 20:45