Questions tagged [google-cloud-dlp]

Questions related to the Google Cloud Data Loss Prevention (DLP) API, which classifies and de-identifies sensitive data such as PII in text and images.

132 questions

1 vote, 1 answer

How to whitelist URLs with Cloud DLP?

I use Cloud DLP to identify sensitive data, but I want to allow (whitelist) some data so that DLP does not flag it. For example, I want to match the URL infoType by default so that DLP identifies it, but I also want to allow certain URLs from google.com and yahoo.com.…
FlutterFirebase
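A minimal sketch of the kind of configuration this question is asking about, assuming an exclusion rule attached to the URL infoType is an acceptable way to express the allow-list. The dictionary mirrors the `inspect_config` request shape (`rule_set`, `exclusion_rule`, `matching_type`); the helper name `build_inspect_config` and the regex itself are illustrative assumptions, not something taken from the question or its answer.

```python
import re

# Allow-list taken from the question text: google.com and yahoo.com.
ALLOWED_DOMAINS = ["google.com", "yahoo.com"]


def build_inspect_config(allowed_domains):
    """Build an inspect_config dict that detects URLs except allowed domains.

    Hypothetical helper: the regex is an assumption and only handles
    lowercase hosts with an optional http/https scheme and optional path.
    """
    domain_alternatives = "|".join(re.escape(d) for d in allowed_domains)
    # Matches an allowed domain or any subdomain of it, nothing else.
    pattern = (
        r"^(https?://)?([a-z0-9-]+\.)*(" + domain_alternatives + r")(/.*)?$"
    )
    return {
        "info_types": [{"name": "URL"}],
        "rule_set": [
            {
                # The rule applies only to URL findings.
                "info_types": [{"name": "URL"}],
                "rules": [
                    {
                        "exclusion_rule": {
                            "regex": {"pattern": pattern},
                            # Drop a finding only if the whole finding matches.
                            "matching_type": "MATCHING_TYPE_FULL_MATCH",
                        }
                    }
                ],
            }
        ],
    }


inspect_config = build_inspect_config(ALLOWED_DOMAINS)
exclusion_pattern = (
    inspect_config["rule_set"][0]["rules"][0]["exclusion_rule"]["regex"]["pattern"]
)
```

The resulting dictionary would be passed as the `inspect_config` of an inspect or de-identify request. The pattern is checked here with Python's `re` module only; DLP evaluates regexes with RE2 syntax, so it is worth testing the pattern against real findings before relying on it.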

1 vote, 2 answers

Text dictionary like transformation in Google DLP

I would like the data to be masked while still making it possible to tell how many people studied at UNIVERSITY_1. Which de-identification transformation can I use to achieve this kind of information/text masking? Input: { "students": [ { …

1 vote, 1 answer

Using Google Cloud DLP Stored infotype and getting 400 Invalid built-in info type name

I have a stored infoType in the ready state on my DLP dashboard. Its name is Federal_Income_Tax. Using the Python example at https://cloud.google.com/dlp/docs/concepts-infotypes I see the following: info_types = [{"name": info_type} for info_type…
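One likely reading of this error, offered as an assumption rather than a confirmed diagnosis: the `info_types` list is meant for built-in detector names, while a stored infoType is referenced through `custom_info_types` with a `stored_type` resource name. The sketch below only builds the request fragment. The project ID `my-project`, the label `FEDERAL_INCOME_TAX`, and the helper name are hypothetical, and it assumes that Federal_Income_Tax is the stored infoType's ID.

```python
def build_stored_info_type_config(project_id, stored_info_type_id, label):
    """Return an inspect_config fragment that references a stored infoType.

    Hypothetical helper: `label` is the name findings will be reported under,
    and the resource name assumes the global (no location) format.
    """
    resource_name = f"projects/{project_id}/storedInfoTypes/{stored_info_type_id}"
    return {
        "custom_info_types": [
            {
                "info_type": {"name": label},
                "stored_type": {"name": resource_name},
            }
        ]
    }


# Placeholder values; substitute the real project and stored infoType ID.
config = build_stored_info_type_config(
    "my-project", "Federal_Income_Tax", "FEDERAL_INCOME_TAX"
)
```

If the stored infoType was created in a specific region, the resource name would include a `locations/...` segment, which this sketch does not handle.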

1 vote, 1 answer

Python GCP API not able to read environment variables

I am developing a DLP client and setting GOOGLE_APPLICATION_CREDENTIALS from the Windows shell. The API call fails with the following error: google.auth.exceptions.DefaultCredentialsError: File "XXXXX.json" was not found. When set in code…
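A small diagnostic sketch for this kind of failure, under the assumption that the variable's value itself is the problem. In `cmd.exe`, `set VAR="path"` stores the quotes as part of the value, which would make the path unreadable. The function name `diagnose_credentials_path` is hypothetical, and the check uses only the standard library.

```python
import os


def diagnose_credentials_path(value):
    """Return (cleaned_path, problems) for a GOOGLE_APPLICATION_CREDENTIALS value."""
    problems = []
    if value is None:
        return None, ["variable is not set in this process"]
    cleaned = value.strip()
    if cleaned != value:
        problems.append("value has leading or trailing whitespace")
    # Detect a value wrapped in literal single or double quotes.
    if len(cleaned) >= 2 and cleaned[0] == cleaned[-1] and cleaned[0] in "\"'":
        problems.append("value is wrapped in literal quotes")
        cleaned = cleaned[1:-1]
    if not os.path.isfile(cleaned):
        problems.append("no file exists at " + cleaned)
    return cleaned, problems


# Inspect the value this Python process actually sees.
cleaned_path, problems = diagnose_credentials_path(
    os.environ.get("GOOGLE_APPLICATION_CREDENTIALS")
)
for problem in problems:
    print("GOOGLE_APPLICATION_CREDENTIALS:", problem)
```

Running this in the same shell session as the failing script shows whether the process sees the variable at all and whether the path resolves to a file.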

1 vote, 1 answer

Data Loss Prevention finds superfluous entities when masking email

I am calling the DLP API to mask person names and email addresses in text, using the following request: Request { "item": { "value": "Eleanor Rigby\nPharmacist\neleanor.rigby@example.com" }, "deidentifyConfig": { …
Marek Grzenkowicz

1 vote, 2 answers

DLP data scan from BigQuery table showing start byte as null

I have scanned a BigQuery table from the Google DLP Console. The scan results are saved back into a BigQuery table. DLP has identified sensitive information, but the start byte is shown as null. Can anyone help me understand why? The source data…
Kuwali

1 vote, 1 answer

Exporting Google Drive/Docs files to Google Cloud Storage

We need to scan files with Google DLP. However, Google DLP scanning is only supported in GCS (https://cloud.google.com/blog/products/identity-security/take-charge-of-your-data-scan-for-sensitive-data-in-just-a-few-clicks). So I need to export the…
hans

1 vote, 1 answer

De-identifying storages in Google Cloud DLP

I was using a Dataflow streaming template for DLP de-identification from GCS to BQ, but I wanted a batch solution. I found cloud.google.com/dlp/docs/deidentify-storage, which provided a new "deidentify" action for the create_dlp_job function. When I…

1 vote, 1 answer

Getting 403 Permission Denied with GCP DLP API

I am writing a Python script to check whether some files in Google Cloud Storage contain PII. The script is below: dlp = google.cloud.dlp_v2.DlpServiceClient() url = "gs://{}/{}".format("my-bucket-name",…
Akash

1 vote, 1 answer

Google Cloud DLP API: Default Secure Communication

We are using the Java library com.google.cloud:google-cloud-dlp to make Google Cloud DLP calls. On the client side, we use DlpServiceClient. I know for a fact that it uses gRPC internally. I am wondering whether the default communication uses…
Sreedhar

1 vote, 1 answer

Google Cloud DLP tag in Data Catalog shows Job State as pending?

I first created a custom template in DLP (with custom detectors), then created a DLP job using the new template against a BQ table and ran the job with the publish to Data Catalog setting. The DLP job completed but the DLP tag in Data…

1 vote, 1 answer

How to read a Parquet file from a GCS bucket and de-identify a specific column using the DLP API?

The following is my JSON object for the DLP API call to mask a specific column of data in a Parquet file stored in a GCS bucket. When calling the dlp.deidentify_content() method I have to pass an item to it, but I am not sure how to pass a Parquet file. I have already…

1 vote, 1 answer

Text limit for Google DLP

I can't seem to find the text size limit for de-identifying text. https://cloud.google.com/dlp/limits says there's a 4 KB limit for each quote. What does "quote" mean? Does it mean a string?
Rahadian Kumang

1 vote, 1 answer

How to get a valid token to use the GCP Data Loss Prevention API on a local machine where the SDK is installed?

Right now I cannot get the Google Cloud Platform Data Loss Prevention (DLP) client library for Python to work behind an SSL proxy (it works fine with other GCP client libraries, for example for storage or…

1 vote, 3 answers

How to get the location of the scanned file when using Google Cloud DLP API?

I'm scanning a nested directory in a Cloud Storage bucket. The result doesn't contain the matched value (quote) although I have include_quote turned on. Also, how do I get the names of the files that contain matches, along with the matched values? I'm…
Kiso