
I am running DLP inspection jobs against Google Cloud Storage, and I was wondering whether there is a way to get the full inspection results instead of just the summary, the same way I can when inspecting external files. Here is a code snippet showing how I get my inspection results when scanning external and local files:

    # Print out the results.
    results = []
    if response.result.findings:
        for finding in response.result.findings:
            finding_dict = {
                "quote": finding.quote if "quote" in finding else None,
                "info_type": finding.info_type.name,
                "likelihood": finding.likelihood.name,
                "location_start": finding.location.byte_range.start,
                "location_end": finding.location.byte_range.end
            }
            results.append(finding_dict)
    else:
        print("No findings.")

The output looks like this:

    {
        "quote": "gitlab.com",
        "info_type": "DOMAIN_NAME",
        "likelihood": "LIKELY",
        "location_start": 3015,
        "location_end": 3025
    },
    {
        "quote": "www.makeareadme.com",
        "info_type": "DOMAIN_NAME",
        "likelihood": "LIKELY",
        "location_start": 3107,
        "location_end": 3126
    }

But when I scan Google Cloud Storage objects and fetch the finished job with get_dlp_job after a Pub/Sub notification, like this:

    def callback(message):
        try:
            if message.attributes["DlpJobName"] == operation.name:
                # This is the message we're looking for, so acknowledge it.
                message.ack()

                # Now that the job is done, fetch the results and print them.
                job = dlp_client.get_dlp_job(request={"name": operation.name})
                if job.inspect_details.result.info_type_stats:
                    for finding in job.inspect_details.result.info_type_stats:
                        print(
                            "Info type: {}; Count: {}".format(
                                finding.info_type.name, finding.count
                            )
                        )
                else:
                    print("No findings.")

                # Signal to the main thread that we can exit.
                job_done.set()
            else:
                # This is not the message we're looking for.
                message.drop()
        except Exception as e:
            # Because this is executing in a thread, an exception won't be
            # noted unless we print it manually.
            print(e)
            raise

The results are in this summary format:

Info type: LOCATION; Count: 18
Info type: DATE; Count: 12
Info type: LAST_NAME; Count: 4
Info type: DOMAIN_NAME; Count: 170
Info type: URL; Count: 20
Info type: FIRST_NAME; Count: 7

Is there a way to get the detailed inspection results when scanning files on Google Cloud Storage, so that I get the quote, info_type, likelihood, etc. for each finding rather than a summary? I have tried a couple of approaches and read through most of the docs, but I have not found anything that helps. I am running the inspection job in a Windows environment with the DLP Python client API. I would appreciate any help with this.

1 Answer


Yes, you can do this. Because the detailed inspection results can be sensitive, they are not kept in the job details/summary. Instead, you can configure a job "action" that writes the detailed results to a BigQuery table that you own and control. That gives you access to the details of every finding (file or table path, column name, byte offset, optional quote, etc.).

The API details for that are here: https://cloud.google.com/dlp/docs/reference/rest/v2/Action#SaveFindings
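
As a rough illustration, a job configuration with such an action could look like the sketch below when using the Python client. The helper name `build_inspect_job` and all project, bucket, dataset, table, and topic names are placeholders invented for this example. Note that the Python client expects snake_case field names (`save_findings`, `output_config`), while the REST reference spells them in camelCase (`saveFindings`). Setting `include_quote` asks DLP to include the matched text with each finding.

```python
def build_inspect_job(bucket_url, project_id, dataset_id, table_id,
                      info_types, topic_id=None):
    """Build an inspect_job dict that saves detailed findings to BigQuery.

    Hypothetical helper: every argument value is supplied by the caller.
    """
    actions = [
        {
            # SaveFindings action: write every finding to a BigQuery table.
            "save_findings": {
                "output_config": {
                    "table": {
                        "project_id": project_id,
                        "dataset_id": dataset_id,
                        "table_id": table_id,
                    }
                }
            }
        }
    ]
    if topic_id:
        # Optional: also notify a Pub/Sub topic when the job finishes.
        actions.append(
            {"pub_sub": {"topic": f"projects/{project_id}/topics/{topic_id}"}}
        )
    return {
        "inspect_config": {
            "info_types": [{"name": name} for name in info_types],
            "include_quote": True,  # include the matched text in each finding
        },
        "storage_config": {
            "cloud_storage_options": {"file_set": {"url": bucket_url}}
        },
        "actions": actions,
    }


# Placeholder values for illustration only.
inspect_job = build_inspect_job(
    "gs://my-bucket/*", "my-project", "dlp_results", "findings",
    ["DOMAIN_NAME", "URL"], topic_id="dlp-topic",
)

# With google-cloud-dlp installed and credentials configured, the dict could
# then be submitted roughly like this (left commented out here):
# import google.cloud.dlp_v2
# dlp_client = google.cloud.dlp_v2.DlpServiceClient()
# operation = dlp_client.create_dlp_job(
#     request={"parent": "projects/my-project/locations/global",
#              "inspect_job": inspect_job}
# )
```

The Pub/Sub callback from the question can stay as it is; once the job completes, the per-finding rows are read from the BigQuery table rather than from `job.inspect_details`.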

Below are some more docs on how to query the detailed findings:
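
As a rough sketch of such a query, the helper below builds a SQL statement that selects the same fields the question collects for local files. It assumes the findings table mirrors the shape of a DLP finding (`quote`, `info_type.name`, `likelihood`, `location.byte_range`); check the actual table schema before relying on these column names. The function name `findings_query` and the project, dataset, and table names are placeholders.

```python
def findings_query(project_id, dataset_id, table_id, limit=100):
    """Build a SQL query listing individual findings from the output table.

    Hypothetical helper; the column names are an assumption based on the
    fields of a DLP finding.
    """
    table = f"`{project_id}.{dataset_id}.{table_id}`"
    return (
        "SELECT quote,\n"
        "       info_type.name AS info_type,\n"
        "       likelihood,\n"
        "       location.byte_range.start AS location_start,\n"
        "       location.byte_range.`end` AS location_end\n"
        f"FROM {table}\n"
        f"LIMIT {int(limit)}"
    )


# Placeholder names for illustration only.
sql = findings_query("my-project", "dlp_results", "findings", limit=10)
print(sql)

# With google-cloud-bigquery installed and credentials configured, the query
# could be run roughly like this (left commented out here):
# from google.cloud import bigquery
# for row in bigquery.Client().query(sql).result():
#     print(dict(row))
```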

Also more details on DLP Job Actions: https://cloud.google.com/dlp/docs/concepts-actions

Scott Ellis
  • Thank you @scott this worked out perfect for me. – Elisius Legodi Jun 10 '22 at 09:20
  • @Scott I'm also trying to achieve the same thing, but I'm having trouble configuring the Action to save the findings to a BigQuery table. Error: Job failed due to: Protocol message Action has no "saveFindings" field. I would appreciate any references on how to achieve this. – Deena Dhayal Oct 19 '22 at 08:53
  • @ElisiusLegodi Please share any references you have. – Deena Dhayal Oct 19 '22 at 08:53
  • Are you able to share your example request? You should be able to add an action like the example specified here: https://cloud.google.com/dlp/docs/analyzing-and-reporting – Scott Ellis Oct 20 '22 at 15:00