
I need to move data that is a month old from a logging table to a logging-archive table, and remove data older than a year from the latter.

There is a lot of data (600k inserts in 2 months).

I was considering simply calling a stored procedure as a batch job every day/week.

I first thought about doing this with stored procs:

  1. Deleting from the archive table everything older than 365 days
  2. Moving data older than 30 days from the logging table to the archive table (I suppose there's a way to do that with one SQL query)
  3. Removing data older than 30 days from the logging table.
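For reference, the three steps could be sketched roughly like this (SQL Server syntax; the table and column names here are made up for illustration, not taken from the question):

```sql
-- Hypothetical schema: logging(id, logged_at, message) and a
-- logging_archive table with the same columns.

-- 1. Purge archive rows older than a year
DELETE FROM logging_archive
WHERE logged_at < DATEADD(DAY, -365, GETDATE());

-- 2. Copy month-old rows into the archive in one statement
INSERT INTO logging_archive (id, logged_at, message)
SELECT id, logged_at, message
FROM logging
WHERE logged_at < DATEADD(DAY, -30, GETDATE());

-- 3. Remove the rows that were just copied
DELETE FROM logging
WHERE logged_at < DATEADD(DAY, -30, GETDATE());
```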

However, this solution seems quite inefficient and will probably lock the DB for a few minutes, which I do not want.

So, do I have any alternative and what are they?

Filip De Vos
Kraz

2 Answers


None of this should lock the tables in a way that affects your application. You are currently writing only to the logging table, and only new records.

You are selecting only OLD records from the logging table, and writing to a table that nothing touches except the archive process.

The steps you are taking sound fine. I would go one step further and, instead of deleting based on date, do an INNER JOIN to your archive table on your id field - then you only delete the specific records you have archived.
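In SQL Server syntax, that delete might look something like this (the table and column names are assumptions for illustration):

```sql
-- Delete only logging rows that already exist in the archive,
-- so nothing that failed to archive is ever removed.
DELETE l
FROM logging AS l
INNER JOIN logging_archive AS a
    ON a.id = l.id;
```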

As a side note, 600k records is not very big at all. We have production DBs with tables of over 2 billion rows, and I know some other folks here have DBs with millions of inserts a minute into transactional tables.

Edit:

I forgot to include this originally: another benefit of your planned method is that each step is isolated. If you need to stop for any reason, none of the steps is destructive or depends on the next step executing immediately. You could potentially archive a lot of records, then run the deletes the next day or overnight without creating any issues.

JNK
  • Thanks for the answer! I guess I can go with my "simple" solution. However, I'm somewhat confused about what you are suggesting for the `inner join` part. Would you care to explain a bit more? Is it for moving the data, or does it make the move unnecessary? – Kraz Jul 21 '11 at 19:01
  • @Kraz - for the `DELETE`. If you `INNER JOIN` on your archive table in the `DELETE` statement, you will only delete rows that are archived. If you `DELETE` based on a date, there's a theoretical potential to delete something that wasn't archived for whatever reason. – JNK Jul 21 '11 at 19:02
  • @Kraz - happy to help. I do a similar process for multiple tables and having it broken up really helps with QC. – JNK Jul 21 '11 at 19:46

What if you archived to a secondary database?

I.e.:

Primary database has the logging table.

Secondary database has the archive table.

That way, if you're worried about locking the archive table while you run a batch on it, it won't take your primary database down.
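A sketch of what that archiving step could look like across two databases on the same instance (all database, schema, and table names here are invented for illustration):

```sql
-- Assuming a separate ArchiveDB on the same SQL Server instance;
-- PrimaryDB.dbo.logging and ArchiveDB.dbo.logging_archive are
-- hypothetical names with matching columns.
INSERT INTO ArchiveDB.dbo.logging_archive (id, logged_at, message)
SELECT id, logged_at, message
FROM PrimaryDB.dbo.logging
WHERE logged_at < DATEADD(DAY, -30, GETDATE());
```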

But in any case, I'm not sure you have to worry about locking -- it just depends on how you implement it.

Chains