I have a project that involves free-text user input (strings of fewer than 80 characters), and I need to detect PII within each string. This has to happen in real time: we need to respond to the user within about 2 seconds, and the response depends partly on whether the text contains PII.
I have already found some solutions, but none of them is quite what I'm looking for:
- Google DLP - requests take over two seconds to process a string, so it cannot be used.
- redact-pii (npm module) - too simplistic in its detections.
- AWS Macie - runs on existing data stores, not on in-flight data.
Do you have any suggestions for services or libraries that can help with this?
The specific PII we want to detect includes names, addresses, and phone numbers, as well as sensitive PII (SPII) such as credit card numbers and social insurance numbers. Essentially, we want our handling of free text to comply with regulations such as PIPEDA and GDPR.
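
To make the requirement concrete, here is a rough sketch of the kind of in-process check I have in mind for the structured identifiers only. It is written in Python for brevity. The regular expressions and the `detect_pii` helper are hypothetical and for illustration only: they assume North American phone formats and space- or dash-separated digits. Card numbers and Canadian SINs are validated with the Luhn checksum to cut down on false positives. Names and addresses are the hard part and are not covered here.

```python
import re

# Hypothetical patterns, for illustration only; real-world formats vary widely.
CARD_RE = re.compile(r"\b(?:\d[ -]?){12,18}\d\b")          # 13 to 19 digits
SIN_RE = re.compile(r"\b\d{3}[ -]?\d{3}[ -]?\d{3}\b")      # 9 digits
PHONE_RE = re.compile(                                     # assumes NANP-style numbers
    r"(?<!\d)(?:\+?1[ .-]?)?\(?\d{3}\)?[ .-]?\d{3}[ .-]?\d{4}(?!\d)"
)


def luhn_valid(digits: str) -> bool:
    """Return True if the digit string passes the Luhn checksum."""
    total = 0
    for i, ch in enumerate(reversed(digits)):
        d = int(ch)
        if i % 2 == 1:      # double every second digit, counting from the right
            d *= 2
            if d > 9:
                d -= 9
        total += d
    return total % 10 == 0


def detect_pii(text: str) -> set:
    """Return the set of structured PII categories found in a short string."""
    found = set()
    for m in CARD_RE.finditer(text):
        digits = re.sub(r"\D", "", m.group())
        if 13 <= len(digits) <= 19 and luhn_valid(digits):
            found.add("credit_card")
    for m in SIN_RE.finditer(text):
        if luhn_valid(re.sub(r"\D", "", m.group())):
            found.add("sin")
    if PHONE_RE.search(text):
        found.add("phone")
    return found
```

For example, `detect_pii("my card is 4111 1111 1111 1111")` should report `credit_card`, while a string with no digits reports nothing. Checks like this run locally with no network round trip, so they fit the latency budget, but they say nothing about names or addresses, which is where I need a better service or library.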