I have Python code that generates SQL queries from English queries using a model. At prediction time, the English query I send to the model may contain sensitive data. I want to mask sensitive information such as nouns and numbers in the English query before sending it, and then restore the original values in the predicted SQL query once I receive it.
In short, I need a Python program that can mask nouns and numbers in a string and then unmask them whenever I want to. The masked values can be replaced with anything you suggest.
Sample English Query:
How many Chocolate Orders for a customer with ID 123456?
Masking Expected Output:
How many xxxxxxxxxx Orders for a customer with ID xxxxxxxxx?
My algorithm will create a query like:
Select count(1) from `sample-bucket` as d where d.Type ='xxxxxxxx' and d.CustId = 'xxxxxxx'
Now I need the unmasked query like below:
Select count(1) from `sample-bucket` as d where d.Type ='Chocolate' and d.CustId = '123456'
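To make the requirement concrete, here is a minimal sketch of the round trip I have in mind. It rests on several assumptions that are not part of my current code: the function names `mask` and `unmask` are hypothetical, the placeholders are unique tokens of the form `MASK_0`, `MASK_1`, ... rather than runs of `x` (identical `x` runs cannot be told apart when unmasking), and the nouns to mask are passed in as a set instead of being detected automatically. Real noun detection would probably need a part-of-speech tagger, which this sketch leaves out.

```python
import re

# Hypothetical placeholder format; any token the model copies through
# unchanged would work equally well.
TOKEN_PATTERN = re.compile(r"MASK_(\d+)")


def mask(text, sensitive_words):
    """Replace numbers and the given words with unique placeholders.

    Returns the masked text and a dict mapping placeholder -> original value.
    """
    mapping = {}   # placeholder -> original value
    seen = {}      # original value -> placeholder, so repeats reuse one token

    # Longer words first so that a longer word wins over a shorter prefix.
    words = sorted(sensitive_words, key=len, reverse=True)
    alternatives = [r"\d+"] + [re.escape(w) for w in words]
    pattern = re.compile(r"\b(?:" + "|".join(alternatives) + r")\b")

    def substitute(match):
        value = match.group(0)
        if value not in seen:
            token = f"MASK_{len(seen)}"
            seen[value] = token
            mapping[token] = value
        return seen[value]

    return pattern.sub(substitute, text), mapping


def unmask(text, mapping):
    """Put the original values back; unknown placeholders are left as-is."""
    return TOKEN_PATTERN.sub(lambda m: mapping.get(m.group(0), m.group(0)), text)


if __name__ == "__main__":
    question = "How many Chocolate Orders for a customer with ID 123456?"
    masked, mapping = mask(question, {"Chocolate"})
    print(masked)

    # Stand-in for the model's prediction on the masked question.
    predicted = ("Select count(1) from `sample-bucket` as d "
                 "where d.Type ='MASK_0' and d.CustId = 'MASK_1'")
    print(unmask(predicted, mapping))
```

Matching here is exact and case-sensitive, and the sketch assumes the model copies each placeholder into the SQL unchanged. The mapping stays on my side and is never sent to the model.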