10

Does restoring a SQL database from backup rebuild its tables and indexes from scratch? Or does it keep it in the same internal physical order it was in at the time of backup?

We're using SQL 2000 with Quest Lightspeed compressed backup, if that makes any difference.

masegaloeh
  • 18,236
  • 10
  • 57
  • 106
BradC
  • 2,220
  • 4
  • 27
  • 36

4 Answers4

16

The answer is no, for whatever backup software is being used.

A backup is a physical operation, not a logical operation. It reads all extents containing allocated pages (i.e. even though only a single page from an 8-page extent is allocated, it will backup the entire 64K extent), and it does it in physical order.

A restore is a physical operation, not a logical operation. It lays down the extents in their rightful places in the data files.

Rebuilding an index (or anything like it) is a logical operation, which must be logged. Backup and restore manipulate the data files directly, without going through the buffer pool, which is one reason why this cannot be done. Another reason this cannot be done is that backup and restore have no understanding of what is contained in the data being backed up.

The main reason this cannot be done, however, is that moving pages around during a restore operation would break the b-tree pointers. If page A points to page B, but page A is moved by the restore process, how is page B updated to point to page A? If it's updated right away, then it may be overwritten by the rest of the restore process. If it's deferred-updated, what if the restore process restored some transaction log that removed page A, or page B? It simply cannot be done.

Bottom line - backup and restore are physical operations that never change the data.

Hope this helps!

PS Although it doesn't directly address this question, check out the article I wrote for the July TechNet Magazine which explains how the various backups work internally: Understanding SQL Server Backups. The September magazine will have the next in the series on understanding restores.

Paul Randal
  • 7,194
  • 1
  • 36
  • 45
  • 2
    I love the fact that you explained that indexing is a logical operation. – Jim B Jun 29 '09 at 20:11
  • Ah, that makes sense. Backup grabs full 64k physical extents, not 8k data pages. – BradC Jun 29 '09 at 21:03
  • This is nonsense. Backup operations frequently compress the data they back up, and that's the kind of "change" we're talking about: After all, the indexes can be re-created from the tables, they are redundant data (provided that the schema is backed up, of course). The real reason is laziness on the part of the database developers. – John Mar 21 '17 at 13:32
  • 1
    @John What are you saying is nonsense? Compression isn't mentioned anywhere here - just rebuilding the indexes, which isn't in any way the same as compression (which doesn't change the pages or their location in the database during a restore, whereas a rebuild does). Recreating indexes from scratch during a restore would be incredibly slow compared to just restoring them as-is from the backup. I think you're misunderstanding something here. – Paul Randal Mar 28 '17 at 11:32
  • Dropping and rebuilding indexes is a type of compression, and compression could change anything, depending on the type of compression. Lossy compression in image processing is still called compression. Any mechanism that represents given information in smaller space is a form of compression, and dropping indexes is a trivial example of that. The slow rebuild time may indeed be a real argument though, at least against the implementation being a worthwhile endeavor. – John Mar 28 '17 at 11:42
  • 1
    @John The difference is that SQL backup compression operates only on the 64k physical extents that Paul describes, it doesn't know or care about the internal *meaning* of the bits. During restore the original 64k extent is recreated and written directly to disk as a physical operation. Also, Paul, who you replied to, WAS one of the MS SQL engine team developers you are decrying as "lazy". I think I'll defer to his expertise on these matters. – BradC Dec 07 '17 at 17:25
6

A native SQL backup is just a page-by-page dump of the backup files, so the answer there is "no". A Quest lightspeed backup likely uses some sort of compression compression algorithm, but it still won't "rebuild" the data files or indexes, which would take a horrendously large amount of time on a big database.

Aaron Alton
  • 1,138
  • 11
  • 10
  • Yeah, but it has to write every page to disk anyway. Why not write them in logical sequence, instead of in a fragmented order? (Maybe this is more of a *backup* question instead of a *restore* question: does the backup write the pages in physical order, or in logical order?) – BradC Jun 29 '09 at 19:44
  • assuming there was a product that wrote the data out in index order, which order would you like the table saved in? Lets say I have a table with 3 columns product_id, productname, and price. Which is the correct column to sort on to save the pages in the indexed order? BTW there is nothing stoping you from indexing on the entire table (a clustered index) or each of the rows (composite index). – Jim B Jun 29 '09 at 20:19
  • @Jim B: That's easy. Tables would be saved in clustered index order. Non-clustered indexes would be stored in key order. Heaps would be kept in original (non-sorted) order. (Aaron and Paul have mentioned valid reasons that backup/restore doesn't do this. Not being able to figure out the "preferred order" isn't one of these reasons. Or else doing a full index rebuild would have the same problem.) – BradC Jun 29 '09 at 20:40
  • The data is backed up in the exact same physical page order that it is saved in within the database files. When the data is restored it is restored in the same page order that it was backed up in. SQL doesn't move data around for various reasons. Including that there could be issues with transactions restoring logs and page chaining issues, not to mention the massive amounts of extra time needed to shuffle the data around on multi-TB databases. – mrdenny Jul 26 '09 at 07:38
2

Backup is done regularly and very often (I hope). So designers made sure backup is as quick as possible. What is the quickest I/O? Sequential. You read blocks from disks in exact physical order, you have the best performance.

Why on Earth should database perform cumbersome random I/O operation every single night, trashing the disks' heads all over the place? The difference would be around two orders of magnitude. There is no possible gain in this.

kubanczyk
  • 13,812
  • 5
  • 41
  • 55
  • I agree with your overall point, but depending on the underlying storage configuration, random I/O may not be orders of magnitude worse than sequential I/O (a SAN drive that is spread over dozens of spindles, for example). In fact, if the data file is fragmented on the hard drive, then even "sequential" I/O isn't really sequential at all. But Paul's points override this anyway (the problem with updating pointers, and that defragmentation should be a logged operation) – BradC Jun 29 '09 at 20:58
0

Hmmm. BradC, do you have worked with Firebird/Interbase before - where the main backup/restore utility/API is more alike the "Copy Database..." of the SSMS/EM ? If so, know that MS SQL Server is NOT like it.

A SQLServer Backup is more a database dump which is restored "AS-IS" - so it's more like a confortable online shortcut for an "detach-copy-reattach on other place" operation. The restored database is almost an exact copy of the original database file (almost because you can change the placement of database files of an restored database)...

Fabricio Araujo
  • 237
  • 3
  • 11