0

I have a script I've been testing out Google's DLP with that has suddenly stopped working:

def redact_text(text_list):
    service = build_client()
    content = service.content()
    items = []
    for text in text_list:
        items.append({
          "type": "text/plain",
          "value": text
        })

    data = { "items":items,
        "inspectConfig": {
            "minLikelihood": "LIKELIHOOD_UNSPECIFIED",
            "maxFindings": 0,
            "includeQuote": False
        },
        "replaceConfigs": [{'replaceWith': redacts[info], 'infoType':{'name': info}} for info in redacts.keys()]
    }
    request = content.redact(body=data)
    response = request.execute()
    return response


def build_client():
    scopes = ['https://www.googleapis.com/auth/cloud-platform']
    credentials = ServiceAccountCredentials.from_json_keyfile_name('cred_name.json', scopes=scopes)
    http_auth = credentials.authorize(Http())

    service = build('dlp', 'v2beta1', http=http_auth)
    return service


if __name__ == '__main__':
    test_list = []
    test_text = "My name is Jenny and my number is (555) 867-5309, you can also email me at notarealemail@fakegmail.com another email you can reach me at is email@email.com.  "
    task = "inspect"
    test_list.append(test_text)
    test_list.append("bill (555) 202-4578, that one place down the road some_email@notyahoo.com")
    print(test_list)
    result = redact_text(test_list)
    print(result)

Normally I'll get back a response, but today I've been getting back:

Traceback (most recent call last):
  File "test_dlp.py", line 82, in <module>
    response = redact_text(text_list)
  File "test_dlp.py", line 42, in redact_text
    content = service.content()
AttributeError: 'Resource' object has no attribute 'content'

I've made no changes to this script, and it has worked previously.

CBredlow
  • 2,790
  • 2
  • 28
  • 47

1 Answers1

3

On May 1st the beta apis were deprecated as the GA versions are now available.

For the most part your script will work fine, but switch to version 'v2'.

The second part of your script that will need to change is that ContentItem "aka items" is just a single item now, not a list.

You can either send through individual requests using

item = {'value': content_string}

or if you still wish to batch this into a single request, you can use a table. (i didn't run this so excuse compile typos, but the api surface is this.)

var rows = []
var headers = [{"name": "column1"}]
for text in text_list:
  rows.append({
    "string_value": text,
  })

item = {
  "table": {
    "headers": headers,
    "rows": rows
  }
}
 data = { "item":item,
          "inspectConfig": {
            "minLikelihood": "LIKELIHOOD_UNSPECIFIED",
            "maxFindings": 0,
            "includeQuote": False
        },
        "replaceConfigs": [{'replaceWith': redacts[info], 'infoType':{'name': info}} for info in redacts.keys()]
    }
Jordanna Chord
  • 950
  • 5
  • 12
  • This was the fix. But I do have a question of what 'headers' should contain. Another fix was my credentials were having issues as well. – CBredlow May 09 '18 at 20:41
  • Headers are column names. They can be used as hints to our system as what might be in a column, and are also used to tell you which column the finding is in. – Jordanna Chord May 11 '18 at 00:21