
We have about 40 file servers in our intranet to distribute software packages. The servers have names like example01, example02, etc. Every name resolves to a single IP address (A record), and the IP resolves back to that name (PTR), for every single server.

The problem is that for one particular file (mypackage.cab) I get different results depending on whether I use:

\\192.0.2.01\fs\pkg\X12345678

or

\\example01.foo\fs\pkg\X12345678

In one case the file is correct; in the other it has exactly the right size but consists entirely of zeros. For a given combination of client and server I can reproduce this reliably. It doesn't matter whether I download with Windows Explorer, with robocopy, or even from Linux with smbclient. The result is always the same: one file corrupt, the other OK.

It happens only for certain combinations of clients and servers, not others. For example:

client01 example01.foo -> OK (192.0.2.01 is also OK)
client01 example02.foo -> broken (but 192.0.2.02 is OK)

client02 example01.foo -> broken (but 192.0.2.01 is OK)
client02 example02.foo -> OK (192.0.2.02 is also OK)

client03 example06.foo -> OK (but 192.0.2.06 is broken)
client03 example07.foo -> OK (192.0.2.07 is also OK)
etc...

In some cases I get the broken file when I use the IP address, in other cases when I use the name. For every client the majority of servers are OK, but from every client I tested there are at least 4 cases of broken files. All this happens only for mypackage.cab (about 5k in size); it has never happened for any of the other files in the same directory.
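To make the comparison between name-based and IP-based access repeatable, something like the following Python sketch could be run on a client. It reads the file through each path and reports its size, a SHA-256 digest, and whether the content is all zeros. The function names are made up for illustration, and the full paths assume that mypackage.cab sits directly in the directory shown above; adjust them to the real layout. From Linux the share would have to be mounted first.

```python
import hashlib
from pathlib import Path


def fingerprint(path):
    """Return (size, sha256 hex digest, all_zeros flag) for one file."""
    data = Path(path).read_bytes()  # reading it whole is fine for a ~5k file
    all_zeros = len(data) > 0 and data.count(0) == len(data)
    return len(data), hashlib.sha256(data).hexdigest(), all_zeros


def compare(paths):
    """Fingerprint every path; unreadable paths are recorded, not fatal."""
    results = {}
    for p in paths:
        try:
            results[p] = fingerprint(p)
        except OSError as exc:
            results[p] = exc
    return results


if __name__ == "__main__":
    # Assumed full paths (hypothetical); substitute the real ones.
    candidates = [
        r"\\example01.foo\fs\pkg\X12345678\mypackage.cab",
        r"\\192.0.2.01\fs\pkg\X12345678\mypackage.cab",
    ]
    for path, result in compare(candidates).items():
        print(path, result)
```

Two paths that return the same size but different digests, with the all-zeros flag set on one of them, match the symptom described here.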

Confused? I certainly am. Any idea what could cause this, or what to try in order to figure it out, is welcome.

Clients are Windows XP. Servers are NetApp filers that I don't have access to. I can (and will) contact the filer team again, but first I need some idea of what is going on.

splattne
user4260
  • Interesting bug. Do you think it's triggered by the name of the file, the contents, or its permissions / metadata? Have you tried changing any of these? (experiment on a copy in case you make the bug go away and can't get it back!) – Hugh Allen Apr 27 '10 at 13:53
  • @Hugh Allen Unfortunately the distribution to the file server is not under my control, but checking the metadata (file dates) was a good hint and helped to figure out what is going on. See my own answer below. – user4260 Apr 30 '10 at 10:33

1 Answer


Found an explanation for this strange behaviour. example01.foo, example02.foo, etc. are DFS servers. The real file servers are behind them. One of these file servers has a corrupt version of mypackage.cab.
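Since the file dates were the hint that led here, a small Python sketch along these lines could group the access paths by the metadata of the file they return. It assumes the copies on the backend file servers differ in modification time, which may not hold in general; `group_by_copy` is a made-up helper name.

```python
import os
from collections import defaultdict


def group_by_copy(paths):
    """Group paths by the (size, mtime in whole seconds) of the file they return."""
    groups = defaultdict(list)
    for p in paths:
        st = os.stat(p)
        groups[(st.st_size, int(st.st_mtime))].append(p)
    return dict(groups)
```

Paths that end up in the same group are probably served from the same copy; a group of its own for a single name or IP address points at a different backend copy worth checking.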

I still don't know why a given combination of client and DFS server name or IP address always hits the same file server. At least it sounds like reasonable behaviour, considering that these servers are spread all over the world.

The filer guys are currently fixing the corrupt file; we will see if it helps...

EDIT: This solved the problem.

user4260