0

I have a raw table which has a variant column of json data. There are some normal views (not materialised view) and tables are created using the json events from the raw table.

After applied a masking policy using UDF on the variant column of the raw table when the role is bi_analyst, there are two issues I found:

  1. The tables derived from the underlying table are not masked with bi_analyst role;
  2. The views derived using the underlying table become empty with bi_analyst role;

Does anyone know why this is happened? dose this dynamic masking feature not support views on underlying table?

What I would like to do is masking the underlying data and all the tables and views coming from it are also masked with the specified role. It is easy to deal with those tables, since I can just apply the masking policy on them as well.

However, I have no idea about the views. How can I still access the views with the role, which should able to see the data but not the sensitive columns?

The UDF is:

-- JavaScript UDF to mask pii data --
use role ACCOUNTADMIN;

CREATE OR REPLACE FUNCTION full_address_masking(V variant)
  RETURNS variant
  LANGUAGE JAVASCRIPT
  AS
  $$
    if ("detail" in V) {
        if ("latitude" in V.detail) {
            V.detail.latitude = "******";
        }
        if ("longitude" in V.detail) {
            V.detail.longitude = "******";
        }
        if ("customerAddress" in V.detail) {
            V.detail.customerAddress = "******";
        }
    }

    return V;
  $$;

The Masking policy is:

-- Create a masking policy using JavaScript UDF --
create or replace masking policy json_address_mask as (val variant) returns variant ->
    CASE
      WHEN current_role() IN ('ACCOUNTADMIN') THEN val
      WHEN current_role() IN ('BI_ANALYST') THEN full_address_masking(val)
    ELSE full_address_masking(val)
    END;

The sql command to set masking policy on raw data is:

-- Set masking policy --
use role ACCOUNTADMIN;
alter table DB.PUBLIC.RAW_DATA 
    modify column EVERYTHING 
    set masking policy json_address_mask;

The masking policy is applied on the variant column EVERYTHING, which data structure looks like:

  {
      "detail": {
        "customAddress": "******",
        "id": 1,
        "latitude": "******",
        "longitude": "******"
      },
      "source": "AAA"
    }

A derivative table is:

create or replace table DB.SCHEMA_A.TABLE_A
as 
select * from DB.PUBLIC.RAW_DATA 
where everything:source='AAA';

grant select on table DB.schema_A.table_A to role bi_analyst;

A view is:

create or replace  view DB.SCHEMA_A.VIEW_A  as ( 
select
everything:account::string as account,
everything:detail:latitude::float as detail_latitude,
everything:detail:longitude::float as detail_longitude,
from
DB.PUBLIC.RAW_DATA
where
everything:source::string = 'AAA'

grant select on view DB.SCHEMA_A.VIEW_A to role bi_analyst;

The result is that RAW_DATA is masked, TABLE_A is not masked at all, VIEW_A gets 0 rows returned when querying data with BI_ANALYST role.

chenchenmomo
  • 233
  • 1
  • 3
  • 16
  • It would help if you share your table structure, view definition and Data Masking UDF – demircioglu Aug 24 '20 at 17:21
  • What UDF are you using to apply a masking policy? If a masking policy is applied to a base table correctly, all views that reference that table will also be masked. – Mike Walton Aug 24 '20 at 17:50
  • @demircioglu hello, I updated the question with code and more data details, hope it help to know my issue – chenchenmomo Aug 24 '20 at 20:51
  • @MikeWalton Hi, I just updated the question with more details. Yeah I was expecting all the views are masked too but really cannot figure out why it doesn't work – chenchenmomo Aug 24 '20 at 20:54

2 Answers2

0

#1 - When you create a table from a table that has masked data, you're going to get the data that the role creating the new table has access to in the masked table. So, in your example, TABLE_A has unmasked data, because it was created by a role that has access to it. The masking policy does not automatically get applied to the new table.

#2 - As for #2, I believe your only issue is that the JSON in your example isn't correctly formed, which is why you are getting NULL values. When I fixed this json to the following, it works fine using the same function and masking policy that you've posted:

{
"detail":{
"latitude": 132034034.00,
"longitude": 12393438583732,
"id": 1,
"customAddress" : "XXX Road, XXX city, UK"
},
"source": "AAA"
}

Masked Result:

{
  "detail": {
    "customAddress": "XXX Road, XXX city, UK",
    "id": 1,
    "latitude": "******",
    "longitude": "******"
  },
  "source": "AAA"
}
Mike Walton
  • 6,595
  • 2
  • 11
  • 22
  • Sorry the json in my post is not formatted correctly, I've correct it now! I can confirm it is correctly-formatted in snowflak – chenchenmomo Aug 25 '20 at 08:29
  • It makes sense that the table is not masked with your explanation! but for the views it is not the case because the actual json are formatted correctly in the actual dataset – chenchenmomo Aug 25 '20 at 08:35
0

The issue of the tables not been masked is explained well by @Mike in his answer. The solution can be just to create the derivative tables using a role which is restricted by the masking policy.

The issue of the views is about the type of masked value "******", which is a string type, while the actual type of fields latitude and longitude are float. When creating the view, I still cast the latitude and longitude fields to float type:

create or replace  view DB.SCHEMA_A.VIEW_A  as ( 
select
everything:account::string as account,
everything:detail:latitude::float as detail_latitude,
everything:detail:longitude::float as detail_longitude,
from
DB.PUBLIC.RAW_DATA
where
everything:source::string = 'AAA'

There is a hidden error of casting "******" to float but snowflake still go ahead and create the view. But when I query the data with the BI_ANALYST role, it returns 0 row.

So the workaround is casting those fields to variant type:

create or replace  view DB.SCHEMA_A.VIEW_A  as ( 
select
everything:account::string as account,
everything:detail:latitude::variant as detail_latitude,
everything:detail:longitude::variant as detail_longitude,
from
DB.PUBLIC.RAW_DATA
where
everything:source::string = 'AAA'

Which is not perfect because it completely changed the definition of the view, none of roles can get the actual float/number type of the data, even including accountadmin

chenchenmomo
  • 233
  • 1
  • 3
  • 16
  • This is a good catch on the data types being the issue. If the OP wants to maintain a float on the view, they could also just plug-in `0.0` as the mask value...or `NULL` and that would work, as well, without the need to change the data type to a variant. – Mike Walton Aug 25 '20 at 13:18