I am writing a Python script to check whether some files in Google Cloud Storage contain PII. The script is below:
import threading

import google.cloud.dlp_v2

dlp = google.cloud.dlp_v2.DlpServiceClient()
url = "gs://{}/{}".format("my-bucket-name", "my_file_name")
storage_config = {"cloud_storage_options": {"file_set": {"url": url}}}
parent = dlp.project_path("my-project-name")
# inspect_config is defined elsewhere in the script (not shown here).
inspect_job = {
    "inspect_config": inspect_config,
    "storage_config": storage_config,
}
# This is the call that fails with the 403 shown further down.
operation = dlp.create_dlp_job(parent, inspect_job=inspect_job)
job_done = threading.Event()
job = dlp.get_dlp_job(operation.name)
try:
    if job.inspect_details.result.info_type_stats:
        for finding in job.inspect_details.result.info_type_stats:
            print("Info type: {}; Count: {}".format(finding.info_type.name, finding.count))
    else:
        print("No findings.")
    job_done.set()
except Exception as e:
    print(e)
    raise
finished = job_done.wait(timeout=3000)
if not finished:
    print(
        "No event received before the timeout. Please verify that the "
        "subscription provided is subscribed to the topic provided."
    )
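As a side note, the script fetches the job only once, right after creating it, so the job may still be pending when the results are read. A minimal sketch of a polling loop follows. It assumes a hypothetical callable `get_state` that returns the job state as one of the DLP state names (`PENDING`, `RUNNING`, `DONE`, `CANCELED`, `FAILED`); how that callable wraps `dlp.get_dlp_job(...)` is left out, and the function name and default intervals are illustrative only.

```python
import time


def wait_for_job(get_state, poll_seconds=5.0, timeout_seconds=300.0, sleep=time.sleep):
    """Call get_state() until it reports a terminal state or the timeout elapses.

    get_state is a hypothetical callable supplied by the caller; it is assumed
    to return the job state name as a string.
    """
    terminal_states = {"DONE", "CANCELED", "FAILED"}
    waited = 0.0
    while True:
        state = get_state()
        if state in terminal_states:
            return state
        if waited >= timeout_seconds:
            raise TimeoutError("job still {} after {} seconds".format(state, waited))
        sleep(poll_seconds)
        waited += poll_seconds
```

The `sleep` parameter exists only so the loop can be exercised without real delays; in the script it would be left at its default.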
I read in the documentation that the DLP API creates a service account of its own with the required set of permissions:
When the Cloud DLP is enabled, a service account is added to the project.
To access both Google Cloud resources and execute calls to Cloud DLP by means of a JobTrigger, Cloud DLP uses the credentials of the Google APIs service account to authenticate to other APIs. The Google APIs service account is designed specifically to run internal Google processes on your behalf. The service account is identifiable using the email:
service-[PROJECT_NUMBER]@dlp-api.iam.gserviceaccount.com
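To make the address format above concrete, here is a small sketch that fills in the template. The helper name `dlp_service_agent_email` and the project number `123456789012` are placeholders for illustration. Note that the template expects the numeric project number, not a project ID such as "my-project-name".

```python
def dlp_service_agent_email(project_number):
    """Build the DLP service account address from a numeric project number."""
    number = str(project_number)
    if not number.isdigit():
        # A project ID such as "my-project-name" is not valid here.
        raise ValueError("expected a numeric project number, got {!r}".format(project_number))
    return "service-{}@dlp-api.iam.gserviceaccount.com".format(number)


# 123456789012 is a made-up project number used only as an example.
print(dlp_service_agent_email(123456789012))
```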
When I run the code, I get a 403 error stating that it does not have the required permission dlp.jobs.create. I updated the IAM policy for the account to include a custom role with the permissions listed below (this is only a project for learning purposes).
dlp.analyzeRiskTemplates.create
dlp.analyzeRiskTemplates.delete
dlp.analyzeRiskTemplates.get
dlp.analyzeRiskTemplates.list
dlp.analyzeRiskTemplates.update
dlp.deidentifyTemplates.create
dlp.deidentifyTemplates.delete
dlp.deidentifyTemplates.get
dlp.deidentifyTemplates.list
dlp.deidentifyTemplates.update
dlp.inspectTemplates.create
dlp.inspectTemplates.delete
dlp.inspectTemplates.get
dlp.inspectTemplates.list
dlp.inspectTemplates.update
dlp.jobTriggers.create
dlp.jobTriggers.delete
dlp.jobTriggers.get
dlp.jobTriggers.list
dlp.jobTriggers.update
dlp.jobs.cancel
dlp.jobs.create
dlp.jobs.delete
dlp.jobs.get
dlp.jobs.list
dlp.kms.encrypt
dlp.storedInfoTypes.create
dlp.storedInfoTypes.delete
dlp.storedInfoTypes.get
dlp.storedInfoTypes.list
dlp.storedInfoTypes.update
serviceusage.services.use
My service account has two separate permission sets:
- DLP permissions, granted separately:
  - dlp.jobs.create
  - dlp.jobs.cancel
  - dlp.jobs.delete
  - dlp.jobs.get
  - dlp.jobs.list
- The Owner role, so it has unrestricted access to all Google resources:
  - roles/owner
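Because the permission is checked against whichever identity the client authenticates as, it may be worth confirming which account the script actually runs under. The following is a minimal sketch, assuming the script authenticates with a service account key file referenced by the `GOOGLE_APPLICATION_CREDENTIALS` environment variable (an assumption, since the authentication setup is not shown in the script). `describe_key_file` is a hypothetical helper, not part of any Google library.

```python
import json
import os


def describe_key_file(path):
    """Return the identity fields stored in a service account key file."""
    with open(path) as handle:
        key = json.load(handle)
    return {
        "type": key.get("type"),
        "client_email": key.get("client_email"),
        "project_id": key.get("project_id"),
    }


if __name__ == "__main__":
    # Assumption: credentials come from a key file named by this variable.
    key_path = os.environ.get("GOOGLE_APPLICATION_CREDENTIALS")
    if key_path:
        print(describe_key_file(key_path))
    else:
        print("GOOGLE_APPLICATION_CREDENTIALS is not set; another credential source is in use.")
```

If the printed `client_email` or `project_id` differs from the account and project whose IAM policy was edited, the roles above were granted to a different identity than the one making the call.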
However, when I run the script now, it still gives the following error:
grpc._channel._InactiveRpcError: <_InactiveRpcError of RPC that terminated with:
status = StatusCode.PERMISSION_DENIED
details = "Not allowed, access denied for permission dlp.jobs.create."
debug_error_string = "{"created":"@1581682593.219000000","description":"Error received from peer ipv4:xxx.xxx.x.x","file":"src/core/lib/surface/call.cc","file_line":1056,"grpc_message":"Not allowed, access denied for permission dlp.jobs.create.","grpc_status":7}"
google.api_core.exceptions.PermissionDenied: 403 Not allowed, access denied for permission dlp.jobs.create.