0

I have a database in production and I would like to run an ETL process to DELETE some records (200 millions) of the database but since the database is in FULL model every time I try to run the ETL the Logs file get out of space.

In order to avoid that I would like to change the recovery model to simple, after I have done the process of cleaning the table I will put the recovery in full model again.

Of course before start the process I will backup the Database.

There is any Problem doing that, any recommendations??

Any assistance in this regard will be greatly appreciated.

JC_BI
  • 419
  • 1
  • 5
  • 16

2 Answers2

1

Switching the recovery model back and forth is a bad idea. It is very easy to get the database into a bad state and could easily void out your database backups.

First option would be to get more storage so that as the log grows, you won't run out of the space. Sometimes this easier said then done so your next option would be to run a transaction log backup at a set interval while the ETL is running. This will allow transactions to be committed and for you to keep the log file from filling up. This will additional disk I/O so performance may suffer from this. The third option (one I prefer and use frequently in these situations) is do all of my ETL processing in a staging database where the recovery is set to SIMPLE. The last step of the ETL is to simply update the production database with cleaned data.

tj-cappelletti
  • 1,774
  • 12
  • 19
  • Do you think doing update will be faster thank delete? – JC_BI May 02 '18 at 21:10
  • There are a lot of factors to consider such as disk performance, keys and indexes. It's certainly possible, but hard to say without knowing more details about the table structures and the ETL process. – tj-cappelletti May 02 '18 at 23:11
0

It is about how often you are updating data and whether you need to recover to a particular point in time. For example, if your database is populated once-daily ETL batch then you're better off keeping the database in the SIMPLE recovery model and performing a full or differential back up after the batch completes.

Also, you would typically perform bulk loads in a Datawarehouse ETL process, and FULL recovery can hinder performance due to every transaction being fully logged. If you need point-in-time recovery, and have chosen the FULL recovery model, then you may want to consider switching to the Bulk-logged recovery model for the duration of your ETL process, so that your bulk operations are minimally logged. Reference

ASP
  • 666
  • 4
  • 9