1

As per (Link), the DLP API in GCP can mask sensitive data by partially or fully replacing characters with a symbol (de-identifying sensitive data). However, I couldn't find any clue on how to customize the transformation rule in the request. For example, say we need to transform a 16-digit account number so that, once the value has been detected, the first 6 digits "and" the last 4 digits are left intact while the remaining digits are replaced by "*" (123456******3456), or any similar combination. The configuration seems to only allow masking the first "or" the last digits of the field.

{
  "deidentifyConfig": {
    "recordTransformations": {
      "fieldTransformations": [
        {
          "fields": [
            {
              "name": "NUMBER_ACCOUNT"
            }
          ],
          "primitiveTransformation": {
            "characterMaskConfig": {
              "maskingCharacter": "#",
              "numberToMask": -6
            }
          }
        }
      ]
    }
  }
}

Result of the code above:

"stringValue": "#########123456"

The numberToMask field sets the number of characters to mask and, in combination with reverseOrder, we can obscure just the first or the last digits. But what about both?

Is it possible to use a regex or a transformation rule to create a custom deidentifyConfig? Otherwise, what should be the approach to inspect (detect) a specific piece of sensitive data and apply a custom masking rule using DLP?

For example, how can I get these masked values:

12345678****3456
12345678******56

Note: Dynamic Data Masking in BigQuery is not an option here, since it doesn't yet offer a way to create a custom masking rule.

Lais T

2 Answers

1

That ability is not currently supported, but I'll record it as a feature request for the team.

Jordanna Chord
  • Thanks @Jordanna Chord. In the meantime, does anybody have a workaround to address this issue? – Lais T Aug 30 '23 at 16:12
1

One workaround is to define a custom infoType with a regex that matches your account number, and provide a matching group, like this:

  "inspectConfig": {
    "customInfoTypes": [
      {
        "infoType": {
          "name": "NUMBER_ACCOUNT_TYPE"
        },
        "likelihood": "LIKELY",
        "regex": {
          "pattern": "\\d{8}(\\d{4})\\d{4}",
          "groupIndexes": [
            1
          ]
        }
      }
    ]
  },
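
To check a pattern before sending it to DLP, the effect of the capture group can be approximated locally. This is only a rough sketch in Python: `mask_group` is a hypothetical helper, not part of any DLP client library, and it assumes the field holds nothing but the account number (hence `fullmatch`), whereas DLP itself evaluates the regex with RE2.

```python
import re

# Same pattern as in the inspectConfig above:
# 8 digits kept, 4 digits captured (group 1), 4 digits kept.
PATTERN = re.compile(r"\d{8}(\d{4})\d{4}")


def mask_group(value: str, pattern: re.Pattern = PATTERN,
               group: int = 1, mask_char: str = "#") -> str:
    """Local approximation: mask only the characters of the chosen capture group."""
    match = pattern.fullmatch(value)
    if match is None:
        return value  # no finding, so the value is left untouched
    start, end = match.span(group)
    return value[:start] + mask_char * (end - start) + value[end:]


print(mask_group("1234567890123456"))  # 12345678####3456
```

Changing the digit counts in the pattern moves the masked window, for instance `\d{6}(\d{6})\d{4}` for the 123456******3456 layout from the question.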

Then use infoTypeTransformations to mask your custom infoType finding:

  "deidentifyConfig": {
    "recordTransformations": {
      "fieldTransformations": [
        {
          "fields": [
            {
              "name": "NUMBER_ACCOUNT"
            }
          ],
          "infoTypeTransformations": {
            "transformations": [
              {
                "infoTypes": [
                  {
                    "name": "NUMBER_ACCOUNT_TYPE"
                  }
                ],
                "primitiveTransformation": {
                  "characterMaskConfig": {
                    "maskingCharacter": "#"
                  }
                }
              }
            ]
          }
        }
      ]
    }
  }
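
If several layouts are needed (keep 6 and 4, keep 8 and 2, and so on), the two config fragments can be generated instead of edited by hand. The following Python sketch makes some assumptions: `build_mask_request` and its parameters are hypothetical names, the field and infoType names are the ones used above, and the `item` (the table to de-identify) still has to be added to the body before calling the API.

```python
import json


def build_mask_request(keep_first: int, keep_last: int, total: int = 16,
                       field: str = "NUMBER_ACCOUNT",
                       info_type: str = "NUMBER_ACCOUNT_TYPE",
                       mask_char: str = "#") -> dict:
    """Build inspectConfig + deidentifyConfig that mask only the middle digits."""
    middle = total - keep_first - keep_last
    if keep_first < 0 or keep_last < 0 or middle <= 0:
        raise ValueError("keep_first + keep_last must leave at least one digit to mask")
    # Capture group 1 is the only part reported as a finding, so only it is masked.
    pattern = r"\d{%d}(\d{%d})\d{%d}" % (keep_first, middle, keep_last)
    return {
        "inspectConfig": {
            "customInfoTypes": [{
                "infoType": {"name": info_type},
                "likelihood": "LIKELY",
                "regex": {"pattern": pattern, "groupIndexes": [1]},
            }]
        },
        "deidentifyConfig": {
            "recordTransformations": {
                "fieldTransformations": [{
                    "fields": [{"name": field}],
                    "infoTypeTransformations": {
                        "transformations": [{
                            "infoTypes": [{"name": info_type}],
                            "primitiveTransformation": {
                                "characterMaskConfig": {"maskingCharacter": mask_char}
                            },
                        }]
                    },
                }]
            }
        },
    }


body = build_mask_request(keep_first=6, keep_last=4, mask_char="*")
print(json.dumps(body["inspectConfig"], indent=2))
```

`json.dumps` takes care of escaping the backslashes, so the pattern appears as `\\d{6}(\\d{6})\\d{4}` in the serialized request, in the same form as the hand-written JSON.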

Example request using the REST API: https://cloud.google.com/dlp/docs/reference/rest/v2/projects.locations.content/deidentify?apix=true&apix_params=%7B%22parent%22%3A%22projects%2Fproject-id%2Flocations%2Fglobal%22%2C%22resource%22%3A%7B%22item%22%3A%7B%22table%22%3A%7B%22headers%22%3A%5B%7B%22name%22%3A%22NUMBER_ACCOUNT%22%7D%2C%7B%22name%22%3A%22NUMBER_OTHER%22%7D%5D%2C%22rows%22%3A%5B%7B%22values%22%3A%5B%7B%22stringValue%22%3A%221234567890123456%22%7D%2C%7B%22stringValue%22%3A%221234567890123456%22%7D%5D%7D%5D%7D%7D%2C%22inspectConfig%22%3A%7B%22customInfoTypes%22%3A%5B%7B%22infoType%22%3A%7B%22name%22%3A%22NUMBER_ACCOUNT_TYPE%22%7D%2C%22likelihood%22%3A%22LIKELY%22%2C%22regex%22%3A%7B%22pattern%22%3A%22%5C%5Cd%7B8%7D(%5C%5Cd%7B4%7D)%5C%5Cd%7B4%7D%22%2C%22groupIndexes%22%3A%5B1%5D%7D%7D%5D%7D%2C%22deidentifyConfig%22%3A%7B%22recordTransformations%22%3A%7B%22fieldTransformations%22%3A%5B%7B%22fields%22%3A%5B%7B%22name%22%3A%22NUMBER_ACCOUNT%22%7D%5D%2C%22infoTypeTransformations%22%3A%7B%22transformations%22%3A%5B%7B%22infoTypes%22%3A%5B%7B%22name%22%3A%22NUMBER_ACCOUNT_TYPE%22%7D%5D%2C%22primitiveTransformation%22%3A%7B%22characterMaskConfig%22%3A%7B%22maskingCharacter%22%3A%22%23%22%7D%7D%7D%5D%7D%7D%5D%7D%7D%7D%7D

Mike DaCosta
  • Thanks a lot @Mike DaCosta, it works fine! With `groupIndexes` it's possible to mask different parts of the string as needed. I'd say this is not just a workaround... @Jordanna Chord – Lais T Sep 01 '23 at 17:27