
I'm attempting to performance-test distributed joins on Citus 5.0. I have a master and two worker nodes, and a few hash distributed tables that behave as expected with the default config. I need to use the task tracker executor to test queries that require repartitioning.

However, after setting citus.task_executor_type to task-tracker, all queries involving distributed tables fail. For example:

postgres=# SET citus.task_executor_type TO "task-tracker";
SET
postgres=# SELECT 1 FROM distrib_mcuser_car LIMIT 1;

ERROR:  failed to execute job 39
DETAIL:  Too many task tracker failures

Setting citus.task_executor_type in postgresql.conf has the same effect.

Is there some other configuration change I'm missing that's necessary to switch the task executor?

EDIT, more info:

  • PostGIS is installed on all nodes
  • postgres_fdw is installed on the master
  • All other configuration is pristine

All of the tables so far were distributed like:

SELECT master_create_distributed_table('table_name', 'id', 'hash');
SELECT master_create_worker_shards('table_name', 8, 2);

The schema for distrib_mcuser_car is fairly large, so here's a simpler example:

postgres=# \d+ distrib_test_int
                   Table "public.distrib_test_int"
 Column |  Type   | Modifiers | Storage | Stats target | Description
--------+---------+-----------+---------+--------------+-------------
 num    | integer |           | plain   |              |

postgres=# select * from distrib_test_int;
ERROR:  failed to execute job 76
DETAIL:  Too many task tracker failures
jasonmp85
  • Normally, just setting `task_executor_type` to `task-tracker` should be enough. Could you provide some more information? Which partition method did you use? How many shards do you have? What is the schema of `distrib_mcuser_car`? Do you use any other plugins? Did you do any other configuration changes? – Ahmet Eren Başak Apr 29 '16 at 07:44
  • @AhmetErenBaşak I've updated the question with more information. Thanks for your help! – Franklin Dingemans Apr 29 '16 at 20:14

2 Answers


The task-tracker executor assigns tasks (queries on shards) to a background worker running on the worker node, which connects to localhost to run the task. If your superuser requires a password when connecting to localhost, then the background worker will be unable to connect. This can be resolved by adding a .pgpass file on the worker nodes for connecting to localhost.
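As a rough sketch of what such a file could look like, assuming the default port 5432, a superuser named postgres, and a placeholder password (none of these values come from the question, so substitute your own):

```
# ~/.pgpass in the home directory of the OS user that runs PostgreSQL on each worker
# Format: hostname:port:database:username:password
# Port, user name, and password below are assumed placeholders.
localhost:5432:*:postgres:your_password_here
```

libpq ignores the file unless its permissions are restricted to the owner (chmod 0600 ~/.pgpass). To check the result, connect from the worker to itself as that OS user, for example with psql -h localhost -U postgres, and confirm that no password prompt appears.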

jasonmp85
Marco Slot

You can modify the authentication settings in pg_hba.conf to let the workers and the master connect to each other without password checks.

Add the following lines to the master's pg_hba.conf:

host    all             all             [worker 1 ip]/32            trust
host    all             all             [worker 2 ip]/32            trust

And the following lines to worker-1's pg_hba.conf:

host    all             all             [master ip]/32              trust
host    all             all             [worker 2 ip]/32            trust

And the following lines to worker-2's pg_hba.conf:

host    all             all             [master ip]/32              trust
host    all             all             [worker 1 ip]/32            trust

This is intended only for testing. DO NOT USE this on a production system without taking the necessary security precautions.
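Note that PostgreSQL does not pick up edits to pg_hba.conf until the server re-reads its configuration. One way to trigger that, shown here as a minimal sketch to be run as a superuser on every node whose file was changed:

```sql
-- Ask the server to re-read its configuration files, including pg_hba.conf.
-- Returns true if the reload signal was sent successfully.
SELECT pg_reload_conf();
```

Running pg_ctl reload against the node's data directory has the same effect; a full restart is not needed for pg_hba.conf changes.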

techraf
Murat Tuncer