1

I was able to create a data masking policy for a json column for the top level keys with the following, but couldn't figure out to go to deeper layers in json. Anybody has done that?

CREATE OR REPLACE MASKING POLICY json_mask_test AS
(val variant) returns variant ->
CASE
WHEN invoker_role()='ADMIN' THEN val
ELSE object_insert(
     object_insert(
     object_insert(val, 'pii_field', '***', true),
         'address','***', true),
         'lastName','***', true)
END

If object_insert is the only way to create a masking policy on a json field, looks like it's limited to top level keys.

I was using the example for On Variant Data

Also as a side effect this policy inserts the keys into the json fields when the keys don't exist in original field. To be able to eliminate this would be desirable.

Edit:

I have used this json for the example above {"regular_field": "regular data", "pii_field": "pii data"}

I was trying to mask LastNames in a json like the following

'{"root":[{"employees":[
   {"firstName":"John", "lastName":"Doe"},
   {"firstName":"Anna", "lastName":"Smith"},
   {"firstName":"Peter", "lastName":"Jones"}
]}]}'
demircioglu
  • 3,069
  • 1
  • 15
  • 22
  • What is the form of your JSON in this example? – Mike Walton Jul 21 '20 at 13:26
  • @MikeWaltonI have added the two json strings I've used for this test above. – demircioglu Jul 21 '20 at 16:23
  • I think the best way to accomplish this is to create a UDF that is capable of flattening, parsing, replacing, and rebuilding your VARIANT field, and then making reference to that UDF in your CASE statement in your data masking. – Mike Walton Jul 22 '20 at 16:14
  • I have json fields with multiple layers up to 1500 keys in them, this is not going to be feasible I won't be able to create a masking policy unless Snowflake comes up with a better way – demircioglu Jul 22 '20 at 18:58
  • Do you flatten that data out within Snowflake or just leave it as json and allow users to query it directly? – Mike Walton Jul 23 '20 at 13:44
  • I leave them in json and create views to flatten select columns for data consumption, I think I will create the masking policy on the views in this case. – demircioglu Jul 23 '20 at 16:12
  • Yeah, that was going to be my suggestion. :-) – Mike Walton Jul 23 '20 at 17:57
  • As a note, if you use secure views in your flatten, and don't provide access to the underlying tables, you don't need a masking policy. You can just put the logic directly into your secure view. – Mike Walton Jul 23 '20 at 17:58

2 Answers2

0

In the case above we create views to flatten json fields for consumption.

To meet the requirement, the masking policy can be applied to the view.

View definition :

CREATE OR REPLACE VIEW v_json_mask AS 
SELECT n.value:firstName::string firstName, n.value:lastName::string lastName
FROM json_test,
lateral flatten (input => v:root) r,
lateral flatten (input => r.value) e,
lateral flatten (input => e.value) n

I have created another masking policy for last name masking and applied to the view

CREATE OR REPLACE MASKING POLICY lastName_mask AS
(lastName text) returns text ->
CASE
WHEN invoker_role()='ADMIN' THEN lastName
ELSE '*masked*'
END;

ALTER VIEW v_json_mask MODIFY COLUMN lastName SET MASKING POLICY lastName_mask;

Running SELECT * FROM v_json_mask with any role not specified in masking policy returns

FIRSTNAME   LASTNAME
John        *masked*
Anna        *masked*
Peter       *masked*
demircioglu
  • 3,069
  • 1
  • 15
  • 22
0

This also works and was the recommended practice for masking data before Snowflake created data masking policies. Just use RBAC to control access to the JSON table and use a Secure View.

CREATE OR REPLACE SECURE VIEW v_json_mask AS 
SELECT n.value:firstName::string firstName, 
       CASE WHEN current_role()='ADMIN' THEN n.value:lastName::string
            ELSE '*masked*'
       END as lastName
FROM json_test,
lateral flatten (input => v:root) r,
lateral flatten (input => r.value) e,
lateral flatten (input => e.value) n
Mike Walton
  • 6,595
  • 2
  • 11
  • 22
  • Yes this would work for the use case above. Thanks. But I have dozens of tables with PII columns, I would like to create a central location for data masking policies and apply them to multiple tables and views. – demircioglu Jul 24 '20 at 13:16