0

Problem: Created User with role and row policies do not see data from all shards unless default user is granted the same row policies. Setup is the following: I have clickhouse cluster, 3 shards(2 replicas each shard).

 <access_management>1</access_management>

is turned on to enable user creation. Embedded click-house keeper is enabled.


CREATE USER admin_user GRANTEES ANY ON CLUSTER '{cluster}' IDENTIFIED WITH plaintext_password BY 'admin';
CREATE ROLE admin_role ON CLUSTER '{cluster}';
GRANT admin_role TO admin_user ON cluster '{cluster}';
GRANT ON CLUSTER '{cluster}' SELECT ON testdb.* TO admin_role; 

CREATE ROW POLICY IF NOT EXISTS  
  admin_rp_local_table ON CLUSTER '{cluster}' ON testdb.local_table
USING 1 TO admin_role;

CREATE ROW POLICY IF NOT EXISTS  
  admin_rp_distributed_table ON CLUSTER '{cluster}' ON testdb.distributed_table
USING 1 TO admin_role;

Is executed to create user,role and row policy.

After inserting data into distributed table and trying to select it from distributed table as admin user I get result from one shard(current).

I gave row policies to default user and admin user starts to see data from all shards.

CREATE ROW POLICY IF NOT EXISTS  
  default_rp_local_table ON CLUSTER '{cluster}' ON testdb.local_table
USING 1 to default;

CREATE ROW POLICY IF NOT EXISTS  
  default_rp_distributed_table ON CLUSTER '{cluster}' ON testdb.distributed_table
USING 1 to default;

Why granting row policies to default user affects shards visibility of admin user in a cluster? (Data from all shards will be in a result set)

Added example of the situation:

We create table 'local_table' with columns id, name. Created distributed_table from local_table. Insert following data via distributed_table: (1:Bob),(2:Fred),(3:Greg). First shard local table has (1:Bob). Second shard local table has (2:Fred). Third shard local table has (3:Greg).

Added admin_role, added admin_user, granted select for admin_user. Now default and admin_user see data - (1:Bob),(2:Fred),(3:Greg) when selected from distributed table.

If we add row policy to admin_user now and make select from distributed table: default sees no data(expected, local and distributed) and admin sees only - (1:Bob) (data from local table but should see data from all shards).

if now we add the same row policy but for DEFAULT user: default see (1:Bob),(2:Fred),(3:Greg) admin see (1:Bob),(2:Fred),(3:Greg). Why so? is it a bug?

2 Answers2

0

https://clickhouse.com/docs/en/sql-reference/statements/create/row-policy

note

If there are no row policies defined for a table then any user can SELECT all the row from the table. Defining one or more row policies for the table makes the access to the table depending on the row policies no matter if those row policies are defined for the current user or not. For example, the following policy

CREATE ROW POLICY pol1 ON mydb.table1 USING b=1 TO mira, peter

forbids the users mira and peter to see the rows with b != 1, and any non-mentioned user (e.g., the user paul) will see no rows from mydb.table1 at all.

If that's not desirable it can't be fixed by adding one more row policy, like the following:

CREATE ROW POLICY pol2 ON mydb.table1 USING 1 TO ALL EXCEPT mira, peter

You are in the public library. You can read any book except the book from the special locked bookshelf. The only John has a key to this bookshelf, automatically all other users don't have access including a librarian.

Denny Crane
  • 11,574
  • 2
  • 19
  • 30
  • I have read documentation and this snipped does not correspond to the situation I described. I added an example of situation. Could you please have a look once more when you have the time to do so? – Maik Jevdokimov Jun 15 '23 at 14:18
0

Embedded click-house keeper is enabled. This was the root of the problem. In cluster setup Clickhouse keeper uses credentials to connect to other shards(when we make query via distributed table) in order to get the data. So when we make a query from distributed table the query to local shard is executed as a current user(the one we used to connect to database), then keeper is using default user to query other shards. That is why in my example after setting row policies for admin user and making a query as that user only first(local) shard data is shown. And that is why if we set row policies for default user the query for admin user works correctly. Creating a new user with all 'permissions' only for keeper may be a solution for this situation. Moral of the story - I should read the documentation of the tools I am using more carefully).

user | username that will be used to authenticate to the cluster instances |default

password | password for the user define to allow connections to cluster instances | ClickHouse123!

https://clickhouse.com/docs/en/guides/sre/keeper/clickhouse-keeper