18

I use SQL Server 2005 as a data store for a lot of data I do analytic work on. This is not a transactional database, as I am not hitting it with updates or capturing real-time data. I get a few gigs of data from my clients, load them into SQL Server, and do a series of manipulations. I then grab bits of this data and pull them into R, where I do most of my analysis. I then push a little data into tables in SQL Server and maybe do a join or two.

I have a heck of a time with the logs in SQL Server getting big, and I assume it takes some degree of overhead to create them. How can I configure SQL Server so that it runs with little or no logging? If things get corrupted I am happy to just start from the beginning. Any ideas how to make this all faster?

BTW, no need to tell me how to shrink logs; I'm already doing that. But I wish I didn't have to make the logs in the first place. I'm only using the DB to house the data because it's too big to fit into memory in R.

Should I be using a simpler DB than SQL Server? Feel free to tell me I am killing an ant with a sledgehammer. But please recommend a more appropriately sized hammer. :)

Nick Kavadias
JD Long
  • Why you want to be restrictive with shrink of database files: http://www.karaszi.com/SQLServer/info_dont_shrink.asp – JohnB Jul 13 '10 at 18:25

6 Answers

13

How can I configure SQL Server so that it runs with little or no logging?

I don't believe you can.

However, if you configure the database to use the simple recovery model (each database on a server can have a different setting), the log file won't keep growing: it is truncated automatically at each checkpoint. This is done by setting the recovery model to "simple".

Under the simple recovery model, the log is only used to hold the state of in-flight transactions until they are fully written into the main database.
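
A minimal T-SQL sketch, assuming a database named AnalysisDB (a placeholder; substitute your own database name):

-- Switch the database to the simple recovery model
ALTER DATABASE AnalysisDB SET RECOVERY SIMPLE;

-- Verify the current recovery model
SELECT name, recovery_model_desc
FROM   sys.databases
WHERE  name = 'AnalysisDB';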

Richard
9

One way to reduce logging when working with large data sets is SELECT ... INTO. It creates a new table, and under the simple or bulk-logged recovery models the operation is minimally logged rather than fully logged.

There are some things to watch for in doing this:

  • Computed columns become regular data columns
  • Indexing and identity columns would need to be established too

When done properly it can save not only space but processing time.
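
For example, a minimal sketch (the table and column names are illustrative, reused from the example below):

-- Copies rows into a brand-new table; minimally logged under the
-- simple or bulk-logged recovery models
SELECT [Id], [Message]
INTO   [MyTable_Copy]
FROM   [MyTable];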

The alternative is something like what I am doing right now, as an example:

UPDATE [MyTable] 
SET    [Message] = REPLACE([Message], N'Content_Type', N'Content-Type')

This works fine, but it updates the whole table in one huge transaction. Instead you can do:

DECLARE @IDs   TABLE ([Id] int)
DECLARE @Batch TABLE ([Id] int)

-- Collect the IDs of every row that needs updating
INSERT INTO @IDs ([Id]) SELECT [Id] FROM [MyTable]

WHILE EXISTS (SELECT 1 FROM @IDs)
BEGIN
  -- Grab the next batch of 1,000 IDs
  INSERT INTO @Batch ([Id]) SELECT TOP 1000 [Id] FROM @IDs

  UPDATE [MyTable]
  SET    [Message] = REPLACE([Message], N'Content_Type', N'Content-Type')
  WHERE  [Id] IN (SELECT [Id] FROM @Batch)

  -- Drop the processed IDs and empty the batch for the next pass
  DELETE FROM @IDs   WHERE [Id] IN (SELECT [Id] FROM @Batch)
  DELETE FROM @Batch
END

This updates the table 1,000 rows at a time, keeping your transaction size down.

Oleks
Lee
7

You can minimize log consumption in SQL Server by changing the database recovery model to simple; see this link. Since you're not dealing with concurrency and transactions, have you considered Microsoft Access?

James
  • I moved to SQL Server because I was regularly banging my head on the 2GB limit in Access. I almost asked this question in the form of "how can I get SQL Server to act more like Access" but I was afraid I was going to get a lot of BS about how Access sucks, yada yada. I just need a good data store! – JD Long Feb 21 '09 at 01:58
  • Can you split your data up into several Access database files? The linked table semantics in Access would make it very easy and logically sound to set up a main file that refers to several child Access data files. – James Feb 21 '09 at 20:15
  • Good idea, but accommodating multiple Access tables would be a total kludge that would hamper analysis. Having this in SQL Server also lets me push the expensive queries to a more powerful server. Access would require me to do those queries on the client machine. – JD Long Feb 25 '09 at 00:47
  • @mghaoui: sarcasm and jokes are frowned upon on Stack Overflow. I don't believe any sarcasm about Access was implied here. I thought these remarks and this answer were really informative. Thanks. – TamusJRoyce Aug 07 '11 at 06:15
3

You won't make SQL Server much faster by turning off transaction logging, but the log size can be made smaller by switching to the simple or bulk-logged recovery model, as others have already suggested.

My take on this is that you should never turn off the full recovery model except in special cases like yours, where it's definitely not needed.

The main reason for this is that the transaction log in full recovery can be your only hope of recovery after an accidentally executed UPDATE, DELETE, or TRUNCATE when you don't have backups or not all of the data is in the backups.

There are several threads on this topic where reading the transaction log was the last hope of recovery.

How can I rollback an UPDATE query in SQL server 2005?

How to undo a delete operation in SQL Server 2005?

Again, in your specific case this is probably not an issue, but my guess is that it can be useful to others.

LarryB
2

To minimize logging, use the simple recovery model and do your work in batches.

Mladen Prajdic
  • I was re-reading these responses and the mention of batches caught my eye. Can you give me more insight into what you mean by doing things in batches? If I do a long script with 30 steps is that different than running 30 scripts? Thanks for your help. – JD Long Mar 11 '09 at 13:21
  • By batches I mean: for example, if you have to update/delete 50,000 rows, do that in batches of 1,000, each batch in its own transaction. You can do this with a while loop. For inserts, use bulk insert capabilities (see the sketch after these comments). – Mladen Prajdic Mar 12 '09 at 10:32
  • in the real world, at least in Oracle (ouch!), it's always faster to process your data as a complete set, not split it up into smaller bites. COMMITs take work, as do beginning and ending transactions. Another piece of advice is that the fastest way to update ALL (or most) of the rows in a table is to create a new table. – Neil Kodner Oct 19 '09 at 15:57
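
A minimal bulk-insert sketch, for the bulk-load capability mentioned in the comments above (the file path, table name, and delimiters are all illustrative):

-- Minimally logged under the simple or bulk-logged recovery models
-- when loading into an empty heap with a table lock held
BULK INSERT [MyTable]
FROM 'C:\data\client_extract.csv'
WITH (
    FIELDTERMINATOR = ',',
    ROWTERMINATOR   = '\n',
    TABLOCK   -- the table lock is one of the requirements for minimal logging
);
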
-1

Code using Entity Framework to configure your database as Richard's answer describes:

using (var dbInstance = new YourEntityFrameworkDB_Context())
{
    // Borrow the underlying ADO.NET connection from the EF context
    var sqlConfigConn = dbInstance.Database.Connection as SqlConnection;
    sqlConfigConn.Open();

    using (var sqlCmd = new SqlCommand())
    {
        sqlCmd.Connection = sqlConfigConn;
        // Note: this targets the "model" system database, which changes the
        // default recovery model for databases created afterwards
        sqlCmd.CommandText = "ALTER DATABASE model SET RECOVERY SIMPLE";
        var result = sqlCmd.ExecuteNonQuery();
    }
    sqlConfigConn.Close();
}

And to check whether it was successful, just start Management Studio and inspect the database's recovery model. [Screenshot: Management Studio]


EDIT Feb 2018:

MSDN's description of the recovery models:

╔══════════╦══════════════════════╦══════════════════════════════════════════╗
║ Recovery ║    Description       ║      Recover to a point in time?         ║
║  model   ║                      ║                                          ║
╠══════════╬══════════════════════╬══════════════════════════════════════════╣
║ Simple   ║ No log backups       ║ Can recover only to the end of a backup. ║
║          ║                      ║                                          ║
║ Full     ║ Requires log backups ║ Can recover to a specific point in time, ║
║          ║                      ║ assuming that your backups are complete  ║
║          ║                      ║ up to that point in time.                ║
║          ║                      ║                                          ║
║ Bulk     ║ Requires log backups ║ Can recover to the end of any backup.    ║
║ logged   ║                      ║                                          ║
╚══════════╩══════════════════════╩══════════════════════════════════════════╝
MarkusEgle