We have a new Domain Controller that holds all FSMO roles. We also have two old hardware servers, about 4-5 years old each, set up as secondary Domain Controllers (better described as Domain Controllers that are also DNS servers and Global Catalog servers in the same Site, but hold no FSMO roles). My question: do I run the risk of corruption in Active Directory if one of the secondary Domain Controllers has a drive failure due to old hardware? I am trying to convince the client to buy a new hardware-based server for our second Domain Controller, but budget is tight. Thanks.

dasko
  • If the budget is tight, maybe they'll consider a new, lower-spec energy efficient server that will save them money in electricity costs, or a used/refurbed offering. Hell, I recently picked up a practically new server on eBay for 30% of cost, with an active manufacturer warranty. Even on a tight budget, there are a lot of better options than running old, failing servers in production. – HopelessN00b Jul 15 '12 at 16:39

3 Answers

I say yes, there's a small risk, but in all reality NO, a corrupted or failed drive will in all likelihood not ruin your AD environment. Here is why:

1) A drive failure will render the data unreadable. Unless this is the only DC you have (or it's the only Global Catalog), this is not a problem. (If you have only one DC, or only one Global Catalog, you need to stand up another post-haste!)

So now, we're only talking corruption:

2a) In order for corruption to modify AD, it would have to modify (let's assume a simple bit-flip) the AD binary database files in such a way that the data changes to a new value that is consistent and compatible with the AD Schema for that object.

(This would likely register a consistency-check error, and AD would throw error messages and possibly throw away the damaged parts and pull a fresh copy of the AD Data itself.)
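As a rough illustration of why a single flipped bit tends to be detected rather than silently accepted, here is a toy sketch of a checksummed page. It is not how the AD database is actually laid out: the 4-byte CRC32 prefix and the `write_page`, `read_page`, and `flip_bit` helpers are assumptions made up for this example.

```python
import zlib


def write_page(payload: bytes) -> bytes:
    """Store the payload behind a 4-byte CRC32 checksum (toy layout, assumed)."""
    checksum = zlib.crc32(payload)
    return checksum.to_bytes(4, "big") + payload


def read_page(page: bytes) -> bytes:
    """Return the payload, or raise if the stored checksum does not match."""
    stored = int.from_bytes(page[:4], "big")
    payload = page[4:]
    if zlib.crc32(payload) != stored:
        raise ValueError("page checksum mismatch")
    return payload


def flip_bit(page: bytes, bit_index: int) -> bytes:
    """Simulate disk damage by inverting one bit of the stored page."""
    damaged = bytearray(page)
    damaged[bit_index // 8] ^= 1 << (bit_index % 8)
    return bytes(damaged)


if __name__ == "__main__":
    page = write_page(b"cn=jsmith,dc=example,dc=com")
    print(read_page(page))  # intact page reads back fine
    try:
        read_page(flip_bit(page, 40))  # bit 40 falls inside the payload
    except ValueError as err:
        print("detected:", err)
```

In this model the damaged page is rejected on read instead of being treated as a valid change, which is the behaviour the consistency-check argument relies on.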

2b) The bit-flip would then have to register a valid change to the data and update the USN (Update Sequence Number), or the bit-flip would itself have to update the USN to a valid future value. If the bit-flip changed the USN to a sequence number in the past, the DC would see itself as having out-of-date records and pull the current USN from the other DCs.
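The USN argument can be sketched with a deliberately simplified model. The `ToyDC` class below is hypothetical and only tracks one watermark per replication partner; real AD replication has more moving parts (per-attribute version data, conflict resolution), so treat this as an illustration of the idea, not of the actual protocol.

```python
from dataclasses import dataclass, field


@dataclass
class ToyDC:
    """Hypothetical, heavily simplified domain controller."""
    name: str
    highest_usn: int = 0
    records: dict = field(default_factory=dict)     # key -> (value, usn)
    watermarks: dict = field(default_factory=dict)  # source name -> last USN seen

    def write(self, key, value):
        """A legitimate change: store the value and stamp it with a new USN."""
        self.highest_usn += 1
        self.records[key] = (value, self.highest_usn)

    def corrupt(self, key, value):
        """Simulated on-disk damage: the value changes, the USN stamp does not."""
        _, usn = self.records[key]
        self.records[key] = (value, usn)

    def pull_from(self, source):
        """Pull only records stamped above our watermark for this source."""
        mark = self.watermarks.get(source.name, 0)
        pulled = []
        for key, (value, usn) in source.records.items():
            if usn > mark:
                self.highest_usn += 1
                self.records[key] = (value, self.highest_usn)
                pulled.append(key)
        self.watermarks[source.name] = source.highest_usn
        return pulled


if __name__ == "__main__":
    old, new = ToyDC("old"), ToyDC("new")
    old.write("mail", "a@example.com")
    print(new.pull_from(old))   # the legitimate change replicates
    old.corrupt("mail", "garbage")
    print(new.pull_from(old))   # nothing new above the watermark
    print(new.records["mail"][0])
```

In this model the damaged value stays on the old DC because nothing advanced its USN. Only damage that also happened to produce a higher, valid-looking USN would be offered to the partner.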

Keep in mind that unless anonymous changes are allowed by your AD (which is not the default; I'm not even sure it's possible, but it would be a huge security no-no), a successful authentication and permissions check is required to modify AD. What credentials are used in disk corruption? Again, another cause for a consistency-check failure.

So, the corruption would have to change the data in a meaningful way, provide a valid authenticated user account, and either trigger an update to the USN or itself update the USN to a valid future value. If it did all of those things, YES, it could corrupt your AD environment. It is absolutely possible, but it is highly HIGHLY unlikely.

What is most likely to happen is that AD will choke and throw errors on that server, but the other DCs will be just fine.


With all of that said, you should absolutely replace failed or failing hardware as soon as possible.

gWaldo
  • That was a nice explanation on your point, thanks. – dasko Jul 13 '12 at 18:54
  • I agree. AD uses the Extensible Storage Engine (ESE) as its underlying database engine, and the database pages are checksummed. You would be seeing "JET" errors in your "Directory Service" event log on the "corrupt" DC if the database were being subjected to "bit rot". It's highly unlikely that any "corruption" in the database would ever make it all the way out to other DCs via replication. I don't agree w/ your premise re: on-disk corruption having some kind of authentication-related interaction, however. Corruption of data by the disk itself would be "trusted" by Windows. – Evan Anderson Jul 13 '12 at 19:24
  • I wish I had a way to test my premise, @EvanAnderson. I think that a bit-flipped valid-data-change-on-disk but not recorded as an authenticated event (and without a USN update) would throw consistency errors as well. – gWaldo Jul 13 '12 at 19:33

Yes, there is a risk. There is no such thing as a secondary Domain Controller; AD is a multi-master setup. So if you have corruption on one DC, you could corrupt your AD database.

Zypher
  • OK, that is what I was afraid of. Better, then, to leave this "other" Domain Controller that has clicky drives offline and do an ntdsutil cleanup of the metadata. We had a drive failure last night and I have been questioning whether I should put it back on the network, but I have not. This makes me think I should leave it offline and tidy up AD. Does that make sense to you too? Thanks. – dasko Jul 13 '12 at 16:04
  • yea, leave it off, clicky drives == bbaaaad – Zypher Jul 13 '12 at 16:21
  • This risk - while present - is HIGHLY unlikely. This is why AD has a schema, consistency checks, and by default requires changes to be made by an authenticated account with the required permissions. – gWaldo Jul 13 '12 at 17:51
  • *click click click click* **usn update with bad data** *click click click click* – MDMarra Jul 13 '12 at 17:51
  • Yes, the data can be changed, BUT it has to be changed to *valid* data. How often does the *click-click-click* of a read head hitting the platter flip a bit? All I've ever seen it do is make data unreadable. – gWaldo Jul 13 '12 at 17:58
  • @gWaldo This is true I wrote that comment before I saw yours. It wasn't in response to it. Though, when it comes to Domain Controllers, you can never be too careful. – MDMarra Jul 13 '12 at 18:11
  • @MDMarra I agree; I provided more fleshed-out thoughts in a (currently lower-ranked) answer. – gWaldo Jul 13 '12 at 18:14

If you cannot replace your "secondary" DCs, it may be an idea to set them up as read-only domain controllers (RODCs).

These will basically replicate AD from your "primary" DC and can be queried, but no changes to AD can be made from these machines. Therefore, if one were to be corrupted, you could take it offline with no risk of corrupting AD.

Robin Gill
  • This is for a Server 2003 environment. I think you mean that in Server 2008 you can use an RODC; I don't think I made that clear in my first post, though. Thanks. – dasko Jul 13 '12 at 17:23