
We run a small wired LAN with 3 DCs (W2k8) and about 25 workstations (most of which are XP SP3, some are 7 SP1.) People use roaming profiles with folder redirection for Desktop & My Documents, Application Data and Start Menu. The redirected folders sit on a DFS-R share across the 3 DCs. This setup has been used for about a year.

The whole folder redirection has been, to say the least, a nightmare to us. Users are put offline often, and apparently randomly, with Windows asking for synchronisation despite the users not being offline at all (all pings are fine, and other maybe less connectivity-sensitive services keep running fine.)

People got used to living with it (so to speak) and synchronise every so often. At least Windows 7 seems to deal better than XP with this whole offline/synchronisation issue, in the sense that it doesn't bother people so much with popups, etc. I have searched quite a lot for what could be the cause of this, with no success so far. At this stage I still don't have a clue as to whether this issue is even software related or not.

However, there is at least one clear offline occurrence that I have spotted over this past year: when one of our DCs restarts, some users are put offline, despite the two other DCs remaining up and running. Surely this should not happen, even for users who got their DHCP lease through the restarting DC. This makes me think that something might be misconfigured, which could lead me to the cause of the more general offline/synchronisation issue.

mks-d
  • how many dfs root servers do you have in your namespace? – Rex Nov 16 '12 at 14:52
  • 3, the 3 DCs basically, they replicate some key folders, among which the redirected folders. – mks-d Nov 16 '12 at 15:21
  • ok, you are not just using folder redirection, you are also using csc (client side cache), and csc itself is very intolerant of network interruptions or slowdowns. You can turn the CSC component off if your users are on desktops and not mobile. – tony roth Nov 16 '12 at 15:25
  • replication targets are not the same as roots.. just want to make sure we're talking about the same thing.. – Rex Nov 16 '12 at 15:25
  • @tony Correct, I'm using CSC. I'm afraid of turning CSC off: I remember doing that when setting up the DFS back in the day, and running Folder Redirection with CSC off wiped out a user's redirected folders without the slightest notice. Very scary. – mks-d Nov 16 '12 at 15:46
  • @dindeman yep, that's true. You are having the same issue: the resync would be the equivalent of the files missing. Your network is not performing up to specifications; check NIC speeds all around, on both servers and workstations. When we have this problem, it's when somebody has gone out of their way and misconfigured a switch port or a NIC on a server/workstation. Also, the patches listed by Greg are important. – tony roth Nov 16 '12 at 15:50

2 Answers


What version of the CSC files are you running? Given that there are a lot of known issues with Offline Files functionality, you may want to try updating those files and see if it resolves the issue. A recent version is available here: http://support.microsoft.com/kb/2705233

Greg Askew
  • I'd say if the users are not mobile just turn csc off. – tony roth Nov 16 '12 at 15:28
  • Most of my clients are XP SP3. Moreover the issue described in the link is not what is experienced. DFS shares are in fact very rarely unavailable, despite the offline mechanism that gets triggered. – mks-d Nov 16 '12 at 15:53

OK, you've put a lot out there, let me try to break things down:

  1. When you say that users are being "put offline" when one of the DCs restarts, do you mean that they lose network connectivity (they lose their DHCP-assigned IP address) or do you mean that they lose access to their redirected folders? If the latter, then take DHCP out of that sentence because it's completely unrelated to folder redirection, except for the fact that the client needs network connectivity in order to access its redirected folders.

  2. Ping isn't a very good network troubleshooting tool. Sure, it can tell you if a host has network connectivity and it can tell you the relative response time of that host, but it tells you nothing about what's going on in the network. Try running a packet capture on one of the clients or one of the servers. Look for symptoms of network congestion, like a high volume of broadcast traffic (layer 2 and layer 3 broadcasts), and look for things like a large volume of TCP retransmits and duplicate ACKs. Those are both sure signs of network congestion.

  3. Take a look here for tips on diagnosing DFS problems: http://blogs.technet.com/b/askds/archive/2009/09/29/o-dfs-shares-where-art-thou-part-1-3.aspx. My guess is that the clients that are affected by the DC being down have a referral to their redirected folders via the down DC, which makes sense. If you can run DFSUTIL /PktInfo and/or DFSUTIL /SpcInfo on one of the affected clients when the DC is down you can see which DC is the active referral for the namespace.
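
The retransmit and duplicate-ACK check from point 2 can be sketched in a few lines of Python. This is only an illustration of what to count, not how any capture tool does it: the `Segment` record and the `count_congestion_signs` function are hypothetical names, and the input is assumed to be a simplified, hand-exported list of TCP segments. In practice Wireshark flags the same things for you with the display filters `tcp.analysis.retransmission` and `tcp.analysis.duplicate_ack`.

```python
from collections import namedtuple

# Hypothetical, simplified record of one TCP segment from a capture.
Segment = namedtuple("Segment", "src dst seq ack payload_len")


def count_congestion_signs(segments):
    """Return (retransmits, duplicate_acks) for a list of Segment records.

    Simplified rules, assumed for illustration only:
    - a data segment whose (src, dst, seq) was already seen is a retransmit;
    - a payload-free ACK repeating the previous pure ACK number sent in the
      same direction is a duplicate ACK.
    """
    seen_data = set()   # (src, dst, seq) of data segments already observed
    last_ack = {}       # (src, dst) -> ack number of the last pure ACK
    retransmits = 0
    dup_acks = 0
    for s in segments:
        flow = (s.src, s.dst)
        if s.payload_len > 0:
            key = (s.src, s.dst, s.seq)
            if key in seen_data:
                retransmits += 1
            seen_data.add(key)
        else:
            if last_ack.get(flow) == s.ack:
                dup_acks += 1
            last_ack[flow] = s.ack
    return retransmits, dup_acks


if __name__ == "__main__":
    capture = [
        Segment("client", "dc1", 100, 1, 50),
        Segment("dc1", "client", 1, 150, 0),
        Segment("client", "dc1", 150, 1, 50),
        Segment("dc1", "client", 1, 150, 0),   # duplicate ACK
        Segment("dc1", "client", 1, 150, 0),   # duplicate ACK
        Segment("client", "dc1", 150, 1, 50),  # retransmit
    ]
    print(count_congestion_signs(capture))
```

A handful of hits over a long capture is normal; what matters is a steady stream of them around the times the clients drop to offline mode.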

joeqwerty
  • 1. "Put offline" in the sense of the "Windows Offline Files" feature (and the subsequent synchronisation that Windows offers.) They don't seem offline at all and they can even keep browsing the DFS-R shares straight after the prompt. I hear you about the mixup with the DHCP concept, but I'm trying to spot why a workstation starts thinking it's "offline" and needs to sync again. I have no idea what triggers the Offline Files behavior in Windows, I just know that some workstations go offline when "a" DC restarts, maybe the one that granted the lease? In which case it shouldn't happen. – mks-d Nov 16 '12 at 15:33
  • 3. I'm not sure what a referral exactly is but I can only tell you that no client ever accesses a DC directly, meaning otherwise than via the DFS path (\\DOMAIN.com\root\...) Maybe a referral is a preferred DFS server at a given time? But then how do I control which one is assigned? And if restarting that one puts referred clients offline, then it defeats the concept of DFS. How would people do in big organisations with dozens of DFS servers if restarting one of them brings hundreds of users "offline"? – mks-d Nov 16 '12 at 15:40
  • 2. You're right, clearly this will require further analysis of the overall quality of our LAN. I must say that I suspect hardware issues, if not wiring issues. However, since there was one clear case of workstations switching to offline (restarting any of the DCs), I hoped it would ring a bell. – mks-d Nov 16 '12 at 16:02
  • The clients access a DFS namespace via one of the DFS servers which host the namespace, which are your DCs. This is called a referral. The referral has a "TTL". If the client is connected to the namespace via a referral from the down DC, then that namespace won't be available for the life of the referral. I'm not a DFS guru but this is my understanding of it. Have a look here about what happens during a failover: http://help.globalscape.com/help/wafs4/using_microsoft_dfs_for_failover.htm. – joeqwerty Nov 16 '12 at 16:24
  • Thanks for the link! It sounds like it's never a good idea to restart a DFS server as you never know whether it's a referral somewhere at that time. At least it makes sense that restarting a server will put offline some clients, I mean I "knew" it but was assuming that the DFS was smart enough to handle a member down whatsoever. However my main issue is in fact elsewhere, i.e. finding out why users are put offline so often, even with no DFS share down. What would be the best way to analyse the DFS protocol to understand its downfalls, something like Wireshark? – mks-d Nov 16 '12 at 17:22
  • I'm not a DFS guru so I can't offer much insight into how best to analyze it but in relation to your problem, it may just be that the Offline Files mechanism is more sensitive than the client DFS component in the sense that Offline Files detects the absence of the DFS share quicker than DFS can home itself to an available DFS server for a new referral. This seems to be related - http://blogs.technet.com/b/askds/archive/2011/12/14/slow-link-with-windows-7-and-dfs-namespaces.aspx – joeqwerty Nov 16 '12 at 17:28
  • Thanks again, that one sounds very interesting indeed. And yes, after experiencing the issue for a year I would rather think it's the Offline Files mechanism that could be the culprit. Will read the whole thing in detail and come back to you. – mks-d Nov 16 '12 at 17:33
  • Glad to help... – joeqwerty Nov 16 '12 at 17:34
  • on a locally connected network you should not be butting up against a slow link detection issue, I think the slow link detection is logged in the system event log. – tony roth Nov 16 '12 at 18:22
  • @tonyroth - What I'm wondering is if the fact that the client loses its connection to the DFS server it got its referral from is "emulating" a slow link detection in Offline Files and causing this issue. – joeqwerty Nov 16 '12 at 19:27
  • @joeqwerty: I was also thinking this. Given that the slow link timeout is 64ms on xp, it seems unlikely dfs would do anything before the csc trigger. Too bad they cannot just increase the slow link timeout to a very large value on xp. – Greg Askew Nov 17 '12 at 06:09
  • @Greg Askew - I changed the slow link threshold GPO before yesterday and (due to the weekend) I haven't got any feedback yet. So you're saying this GPO will only affect 7 clients and not XP clients? – mks-d Nov 18 '12 at 14:29
  • @dindeman: XP you can configure the kbps. Windows 7 is more sophisticated, you can configure ms latency, per share. Check the GPO spreadsheets. See: https://blogs.technet.com/b/askds/archive/2009/02/11/configure-slow-link-mode-policy-on-vista-for-offline-files.aspx and https://blogs.technet.com/b/askds/archive/2011/12/14/slow-link-with-windows-7-and-dfs-namespaces.aspx – Greg Askew Nov 18 '12 at 15:23
  • @Greg Askew - Thank you, those links were the ones already provided by joeqwerty and that I followed for Windows 7, but where is the threshold bandwidth specified for XP? – mks-d Nov 19 '12 at 01:53
  • Ok sorry, found it [here](http://offlinefiles.blogspot.com/2009/12/group-policy-slow-link-configuration.html), will try that tomorrow as well. So far I don't have any more complaints from 7 users, still early to tell but fingers crossed ;-) – mks-d Nov 19 '12 at 16:09