I have a small business network with several servers on it. To simplify things, I added a BIND9 DNS server on one of them, with entries for each of the local machines, e.g. vpn.example.com, web.example.com, storage.example.com, and so on. These entries are only served to the local network. When I do an nslookup on, say, vpn.example.com, I always get the expected, valid response. However, more often than not, an attempt to SSH to that server fails, like so:

# nslookup vpn.example.com
Server: 192.168.1.13
Address: 192.168.1.13#53

Non-authoritative answer:
Name: vpn.example.com
Address: 192.168.1.14

# ssh user@vpn.example.com
(after a ~10 second pause)
ssh: Could not resolve hostname vpn.example.com: Name or service not known

# ssh user@192.168.1.14
[Connects immediately]

Web requests to vpn.example.com succeed, as do connections from other applications.

This happens intermittently and seems to be tied to network or server restarts. After everything has been up for a day or two, the problem seems to go away, presumably as the client cache finally figures things out(?). I'm seeing it on my Mac and Windows machines. Any suggestions?

Andrew Schulman
Chris
  • Try running `ssh` in verbose mode (`ssh -v`) and post the output. – alexus Aug 28 '14 at 16:28
  • Wouldn't you know, the whole 'issue disappearing after a few days' thing is kicking in. Let me go poke the servers with a stick and see if I can get em to do it again. God I hate debugging "intermittent" problems :( – Chris Aug 28 '14 at 16:43
  • Look into your DNS; maybe the problem is there. – alexus Aug 28 '14 at 16:53

1 Answer

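Since you have anonymized the domain name, you have hidden important information. Does the domain name of the service you are connecting to happen to end in .local? That suffix is reserved for multicast DNS, so some resolvers may handle such names through mDNS rather than regular DNS.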

nslookup sends its queries directly to the DNS servers. SSH instead asks the operating system to resolve the name, which may use DNS, but may also use a hosts file, multicast DNS (Bonjour), or other name-resolution protocols. So most likely one of your other configured name-resolution methods is broken.
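
To see what SSH sees rather than what nslookup sees, it can help to resolve the name through the operating system's resolver and time the call. The following is a minimal Python sketch, not a definitive diagnostic; the function name `resolve_via_system` is made up for illustration, and it only uses the standard `socket.getaddrinfo` call, which goes through the same system-level resolution path that most applications use.

```python
import socket
import time


def resolve_via_system(hostname):
    """Resolve a name through the OS resolver (hosts file, mDNS, DNS, ...).

    Returns a tuple: (sorted unique addresses, elapsed seconds, error or None).
    The function name is hypothetical; it is not part of any library.
    """
    start = time.monotonic()
    try:
        # getaddrinfo asks the system to resolve the name, unlike nslookup,
        # which sends its query straight to a DNS server.
        infos = socket.getaddrinfo(hostname, None, proto=socket.IPPROTO_TCP)
        addresses = sorted({info[4][0] for info in infos})
        error = None
    except socket.gaierror as exc:
        # This is the same class of failure behind
        # "Could not resolve hostname ...: Name or service not known".
        addresses = []
        error = str(exc)
    elapsed = time.monotonic() - start
    return addresses, elapsed, error
```

Calling `resolve_via_system("vpn.example.com")` on an affected client and comparing the result with the nslookup output should show whether the system resolver fails or stalls while direct DNS works. A long elapsed time followed by an error would be consistent with another lookup method timing out first.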

On Linux, see the hosts line in /etc/nsswitch.conf for the configured name-resolution services and the order in which they are tried.
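
As a rough illustration of how to read that line, the sketch below extracts the lookup sources from the `hosts` entry in order. It is an assumption-laden example: the helper name `hosts_sources` is hypothetical, and the sample text is illustrative rather than taken from the asker's machines.

```python
import re


def hosts_sources(nsswitch_text):
    """Return the sources on the 'hosts' line in lookup order.

    Bracketed action items such as [NOTFOUND=return] are skipped.
    Hypothetical helper for illustration only.
    """
    for raw in nsswitch_text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments
        if not line.startswith("hosts:"):
            continue
        rest = line[len("hosts:"):]
        rest = re.sub(r"\[[^\]]*\]", " ", rest)  # remove [STATUS=action] groups
        return rest.split()
    return []


# Illustrative sample, not the asker's actual configuration.
SAMPLE = """\
passwd: files
hosts:  files mdns4_minimal [NOTFOUND=return] dns
"""

print(hosts_sources(SAMPLE))
```

With a line like the sample, multicast DNS is consulted before regular DNS, and the `[NOTFOUND=return]` action can stop the lookup before DNS is ever tried, which is one way a name that nslookup resolves can still fail for SSH.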

Zoredache