I am using PostgreSQL. After restoring live data into a testing database, I want the restored data to be sanitized so that sensitive information is removed, while the data remains representative of current data distributions.
1 Answer
There is no one-size-fits-all framework for doing this. The best approach is to look carefully at your specific test data and decide which parts of it you need to sanitize.
Typically, sensitive data comes in a number of forms. These include:
Customer name information. In this case, I have used tools like http://random-name-generator.info/ to generate random names to put in place of the actual names. You can find random street address generators as well.
Confidential payment information (things like credit card or bank account info). In this case, I usually find some way of creating a new arbitrary value that I can map the original values onto. The specifics depend on the data and what I am checking, but here I tend to write small tools in whatever programming language I know best.
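As a rough illustration of the "map onto a new arbitrary value" idea, here is a minimal Python sketch. It is only one possible approach, not a standard tool. It assumes a keyed hash (HMAC) is acceptable for your data, that account numbers are plain non-empty digit strings, and that the secret key is kept out of the test environment. The names `SECRET_KEY`, `FIRST_NAMES`, `LAST_NAMES`, `fake_name` and `fake_account_number` are all made up for this example.

```python
import hashlib
import hmac

# Hypothetical secret: generate your own and do not store it in the test environment.
SECRET_KEY = b"replace-with-a-random-secret"

# Hypothetical replacement pools; in practice these could come from a name generator.
FIRST_NAMES = ["Alex", "Blake", "Casey", "Devon", "Emery", "Finley"]
LAST_NAMES = ["Archer", "Brooks", "Carver", "Dalton", "Ellis", "Foster"]


def _digest(value: str) -> bytes:
    """Keyed hash of the original value; the same input always yields the same bytes."""
    return hmac.new(SECRET_KEY, value.encode("utf-8"), hashlib.sha256).digest()


def fake_name(real_name: str) -> str:
    """Pick a replacement name deterministically, so repeated customers stay repeated."""
    d = _digest(real_name)
    first = FIRST_NAMES[d[0] % len(FIRST_NAMES)]
    last = LAST_NAMES[d[1] % len(LAST_NAMES)]
    return f"{first} {last}"


def fake_account_number(real_number: str) -> str:
    """Map a non-empty digit string to another digit string of the same length."""
    n = int.from_bytes(_digest(real_number), "big")
    length = len(real_number)
    return str(n % 10 ** length).zfill(length)


if __name__ == "__main__":
    print(fake_name("Jane Doe"))
    print(fake_account_number("4111111111111111"))
```

Because the mapping is deterministic, the same original value always becomes the same replacement, so duplicates and join relationships between tables survive sanitization. With pools this small, different real names will often collide on the same fake name, which distorts the distribution; a much larger pool reduces that. The generated account numbers are not guaranteed to pass checksum validation (such as the Luhn check on card numbers), so this may not be enough if your tests depend on that.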
