16

According to the author of this story, an improperly configured email server had timeout settings so strict that they effectively limited its communication radius to 500 miles.

The author states that this is the maximum distance a signal can travel there and back at the speed of light within the configured timeout interval, and speculates that this was the real reason for the distance constraint.

However, in my opinion, this story as it is presented does not hold water.

Can it be true?

Sklivvz
Quassnoi
  • 5
    Why do you doubt it? – Oddthinking Jan 25 '14 at 09:05
  • What type of proof would you need other than what is already there (3 millilightseconds = 558.84719 miles)? – nico Jan 25 '14 at 09:57
  • 3
    @oddthinking: because network packets do not travel with the speed of light, to begin with. – Quassnoi Jan 25 '14 at 09:59
  • @Quassnoi Close enough. *Very* close. – Konrad Rudolph Jan 26 '14 at 13:46
  • 1
    @KonradRudolph: specs for the UTP5 cable I used for wiring my house mentioned velocity factor of 0.66 at 125 MHz. – Quassnoi Jan 27 '14 at 06:36
  • 2
    @Quassnoi Which is pretty darn close to c. It may change the details of the story but not the gist of it. These are probably all back-of-the-envelope calculations anyway. I never took them for anything else. – Konrad Rudolph Jan 27 '14 at 11:28
  • 1
    Another story I heard, at least 30 years ago, concerns a Digital Equipment Corp operator who wanted to map the structure of their network (all running VMS). He structured a query so his system would ask the nearest others, and get them to do the same. Within seconds the entire network was flooded with every system querying all the others. Anyone know if this happened? I have no evidence, only a story by an old system admin. – hdhondt Nov 09 '15 at 22:42

1 Answer

28

It's true, but not all the details are accurate

The author posted a FAQ post about the case where they explain that they did take some creative liberty both for simplification and because they didn't remember all the details:

If you're not 100% certain of all the details, why did you write the original post so vividly?

I took license. It made a better story that way. Honestly, would writing "I'm not sure, but maybe..." every second sentence really have changed anything? And I did say at the start that I was changing or eliding irrelevant details for the sake of the story.

Another factor is the context in which this originally appeared. It was posted to the sage-members list, a list for members of SAGE, the System Administrators Guild, in a thread about "favorite impossible tasks". That is to say, it was a light-hearted discussion about the impossible tasks that users or management sometimes bring to sysadmins to solve.

If I had any notion that this post would be forwarded so widely, I would have added some details to answer the questions of skeptics. But I was posting to a list of colleagues, a good proportion of whom I know personally, and so was tailoring it for an audience inclined to believe me. :-)

The story is fun, but the technical details at the end just don't add up.

I know. See the answer to the previous question. First and foremost, I was writing a humorous story based on an actual occurrence, not a case study, so I wrote only what was required to get the point across. I find I suddenly have great respect for writers who write works based on true stories, as I now know how hard it is to find the balance between verisimilitude and storytelling. And I now know the kind of criticism you open yourself up to when you choose storytelling over verisimilitude. :-)

...

That three millisecond time doesn't make sense as the timeout for a connect() call.

Yes, I know. And it wasn't the timeout, actually. In the story, I make it sound like it took all of ten minutes from being made aware of the 500-mile email limit and determining a 3 ms light-speed issue. In fact, this took several hours, and quite a bit of detective work. The point is, eventually I came up with that figure, ran units, and gagged on my latte. (I'm fairly certain it was a different latte from the one I started with.) So what, in particular, is your question about the 3 ms figure?

Well, to start with, it can't be three milliseconds, because that would only be for the outgoing packet to arrive at its destination. You have to get a response, too, before the timeout will be aborted. Shouldn't it be six milliseconds?

Of course. This is one of the details I skipped in the story. It seemed irrelevant, and boring, so I left it out.

Actually, shouldn't it be twelve/eighteen/twenty-four milliseconds, to account for the three-way TCP handshake?

Maybe. Again, this would be a detail my lost notes would answer. But I think that a connect() timeout would be aborted upon receiving a SYN/ACK packet; I don't think that the whole handshake had to be completed. Even if it did, I eventually would have arrived at the 3 ms figure, however I got to it.

Router delays would have been a much greater factor than you admit in the story.

Yes, you're probably right. But they are factors I could account for. I'm not certain this is how I did it, but it seems likely I could have pinged the nearest router I could find (such as one at another school at UNC that managed its own network) to find out what sort of delay a router was likely to add. Then I could multiply that by the number of routers to remote destinations. The number was likely to be constant for other East Coast universities. And even if it wasn't, the delay imposed by an additional router would only be on the order of a few hundred microseconds at most, not enough to make a large difference for nearby destinations.

The story is cute, but it has a fatal flaw: signals don't travel at lightspeed in copper.

That's true, they travel at 3 c / 4 or thereabouts. But the NIC, the campus backbone, and certainly the Internet backbone was all fiber.

Ah-hah! But signals don't travel at light speed in fiber, either!

You got me. I'm told they travel at from 2 c / 3 (yes, slower than copper) up to a few percent under c depending on a wide variety of factors. But again, this was a factor I could, and did, account for. I recall pinging various destinations and writing down distances versus ping times, and coming up with an empirical "effective time" that differed from actual time. This was just another "irrelevant and boring detail" to be left out of the story.

...

Well, the story still can't be true.

Let me ask you a question: regardless of the details, is it possible that a misconfiguration could cause the operational behavior of nearby email being delivered while faraway email was not? I think the answer is yes. In fact, I know the answer is yes, because it happened. But even putting aside my own experience and viewing it as best I can as a skeptical observer, I think the idea is possible, though certainly implausible at first gloss.

If you have a question that isn't answered here, go ahead and email me at trey+500mi@lopsa.org. I may put it in the FAQ and credit you. But I'm more likely to just say, "I don't know, I don't remember and no longer have the raw data to answer your question."
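
To make the arithmetic discussed in the FAQ concrete, here is a minimal sketch of how far a signal can get within a given time budget. It is only a back-of-the-envelope calculation, not the author's actual method: the function name `reach_miles` and its parameters are my own, the velocity factors are the rough figures quoted above (about 2/3 c), and router delays are ignored entirely.

```python
# Back-of-the-envelope reach of a signal within a timeout.
# All names here are illustrative; router and processing delays are ignored.

C_KM_PER_S = 299_792.458   # speed of light in vacuum, km/s
KM_PER_MILE = 1.609344     # international mile, km


def reach_miles(timeout_ms, velocity_factor=1.0, round_trip=False):
    """Return the farthest distance in miles reachable within timeout_ms.

    velocity_factor is the fraction of c at which the signal propagates
    (1.0 for vacuum, roughly 0.66 for the cable figures quoted above).
    With round_trip=True the signal must go out and come back, so only
    half of the time budget is available for the outbound leg.
    """
    budget_ms = timeout_ms / 2 if round_trip else timeout_ms
    distance_km = C_KM_PER_S * velocity_factor * budget_ms / 1000.0
    return distance_km / KM_PER_MILE


if __name__ == "__main__":
    # One-way, at c: the "3 millilightseconds" figure from the comments.
    print(round(reach_miles(3), 2))
    # One-way at 2/3 c, a slower medium.
    print(round(reach_miles(3, velocity_factor=2 / 3), 2))
    # Round trip with a 6 ms budget at c gives the same reach as 3 ms one-way.
    print(round(reach_miles(6, round_trip=True), 2))
```

At c, a 3 ms one-way budget gives roughly 559 miles, which matches the figure in the comments; at 2/3 c the same budget gives roughly 373 miles. So the exact radius depends heavily on which effective timeout and propagation speed you assume, which is consistent with the author saying he worked from an empirical "effective time" rather than from the raw speed of light.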

SIMEL
  • 4
    This FAQ basically boils down to one thing: "I made up some details (and don't even remember which), but this COULD have happened, right?" – Quassnoi Jan 27 '14 at 06:45
  • 8
    No, it's more "This has happened to me a long time ago and I don't remember all the details, so I made the missing details up." – SIMEL Jan 27 '14 at 11:49
  • Is this evidence that the story is true? Or is it just a clarification of the story? –  Jan 29 '14 at 06:51
  • 3
    @Articuno, you need to see this post in its proper context. This is not a professional test study, but rather an anecdotal story. The author gives this as an example of something that happened to them, while acknowledging that the story is lacking in precise details and is intended more for entertainment than as an actual case study. The story is plausible, as setting too small a timeout will make network communication impossible; in my networks class we had some exercises demonstrating just this. – SIMEL Jan 29 '14 at 09:27
  • 1
    @IlyaMelamed I'm just curious how you justify the first two words of the answer: "It's true". This post is good at clarifying the extraordinary parts of the original claim to perhaps lessen the asker's incredulity, but I don't think "it's true" is justified. I appreciate the difficulty in verifying an anecdote, though. This answer is probably the best we can do. I'd just say "the story is not as unlikely as you think"... or something like that. –  Jan 29 '14 at 15:43
  • 1
    @Articuno: for me, the key point is whether this story could have happened on a *real* setup. If someone wrote something like "look at sendmail 5 sources, here are `connect` and `poll` separated by code which would execute for 3 (or 20 or 50) ms on a piece of hardware they were likely to have", I'd take it for a proof. Likewise, if they showed something like "look, connect timeout would be exactly 0 or a multiple of a second", I would take it for a disproof. I'm probably asking for too much but now that we have IP-over-pigeon actually implemented I wouldn't be surprised if someone tested it! – Quassnoi Jan 29 '14 at 17:26
  • 1
    @Quassnoi - A very late suggestion, but I'd say asking whether the *story is true* is different from asking whether *such a story is feasible*. The first is answered as much as it can be by this quote from the original author - he says it happened to him, and there's no way to confirm it. The second would incorporate his extra details to ask whether such a setup is **actually possible**, which is an entirely different question. – Bobson Oct 19 '14 at 00:09