"I want to scrub(or encrypt) the email information from a few tables which are older than a few years"
I hope this means you have a date column on these tables which you can use to determine which ones need to be scrubbed. The most efficient way of tackling the job is to track that date in an operational table, recording the most recent date scrubbed.
For example you have ten years' worth of data, and you need to scrub records which are more than four years old. Now this would work:
update t23
set email = null
where date_created < add_months(sysdate, -48);
But it seems like you want to batch things up. So build a tracking table, which at its simplest would be
create table tracker (
last_date_scrubbed);
Populate the last_date_scrubbed
with a really old date say date '2010-01-01'
Now you can write a query like this
update t23
set email = null
where date_created
< (select last_date_scrubbed + interval '1' year from tracker);
That will clean all records older than 2011. Increment the date in the tracker table by one year. Run the query again to clean stuff from 2011. Repeat until you get to your target state of cleanliness. At which point you can switch to running the query monthly , with an interval
of one month , or whatever.
Obviously you should proceduralize this. A procedure is the best way to encapsulate the steps and make sure everything is kept in step. Also you can use the database scheduler to run the procedure.
"there is one downside to this approach. I thought that you want to be free upon choosing which rows to be updated."
I don't see any requirement to track which individual rows have been scrubbed. After all, the end state is that every record older than a certain date has been scrubbed. When I have done jobs like this previously all anybody wanted to know was, "how many rows have we done so far and how many have we still got to do?" Which can be answered by tracking the sql%rowcount
for each run.