I'm lost trying to figure out when a DNS query will time out. I tried multiple scenarios (on Linux):

  1. No name server configured in /etc/resolv.conf

    ###################### curl #######################
    WRITE_OUT="%{http_code}\t%{time_namelookup}\t%{time_connect}\t\t%{time_starttransfer}\t\t%{time_total}\n"
    
    time curl -k -w "$WRITE_OUT" https://www.google.com/
    000     0.000   0.000           0.000           0.000
    
    curl: (6) Could not resolve host: www.google.com; Unknown error
    
    real    0m0.009s
    user    0m0.000s
    sys     0m0.006s
    
    ##################### nslookup ####################
    time nslookup www.google.com
    ;; connection timed out; trying next origin
    ;; connection timed out; no servers could be reached
    
    real    0m24.012s
    user    0m0.004s
    sys     0m0.009s
    

    As we can see, curl returns immediately (9 ms), while nslookup takes much longer (24 s). This confuses me: curl's behavior makes more sense, as there is no name server specified on the host.

  2. Add an unreachable host IP (one that cannot be pinged) in /etc/resolv.conf, to simulate the name server host being down

    ###################### curl #######################
    time curl -k -w "$WRITE_OUT" https://www.google.com/
    000     0.000   0.000           0.000           19.529  
    curl: (6) Could not resolve host: www.google.com; Unknown error
    
    real    0m20.535s
    user    0m0.003s
    sys     0m0.005s
    
    ##################### nslookup ####################
    time nslookup www.google.com
    ;; connection timed out; trying next origin
    ;; connection timed out; no servers could be reached
    
    real    0m20.008s
    user    0m0.006s
    sys     0m0.003s
    

    Hurray! Looks like curl and nslookup are on the same page.

  3. Add a host IP address that can be pinged but runs no DNS service, to simulate a server that is alive while its name server service is down

    ###################### curl #######################
    time curl -k -w "$WRITE_OUT" https://www.google.com/
    000     0.000   0.000           0.000           4.513
    curl: (6) Could not resolve host: www.google.com; Unknown error
    
    real    0m5.520s
    user    0m0.004s
    sys     0m0.005s
    
    ##################### nslookup ####################
    time nslookup www.google.com
    ;; connection timed out; trying next origin
    ;; connection timed out; no servers could be reached
    
    
    real    0m20.010s
    user    0m0.006s
    sys     0m0.005s
    

    Confused again!

The most confusing part is that, according to the resolv.conf manual page, the default timeout is 5 seconds and the default number of attempts is 2. So I would expect the total timeout to be 5 seconds * 2 = 10 seconds, but none of the results above match that.
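
To make the 5 seconds * 2 arithmetic concrete, here is a minimal sketch that reads resolv.conf-style text on stdin and prints a nominal upper bound. The function name `resolv_budget` and the formula (name servers * timeout * attempts) are assumptions for illustration only; as the measurements above show, the real resolver can take more or less time than this.

```shell
#!/bin/sh
# Sketch: nominal upper bound for a lookup, from resolv.conf-style text on stdin.
# Assumption: bound = nameservers * timeout * attempts; real resolvers may differ.
resolv_budget() {
    awk '
        BEGIN { timeout = 5; attempts = 2; servers = 0 }  # defaults quoted from the man page
        $1 == "nameserver" { servers++ }
        $1 == "options" {
            for (i = 2; i <= NF; i++) {
                if ($i ~ /^timeout:[0-9]+$/)  { split($i, kv, ":"); timeout  = kv[2] + 0 }
                if ($i ~ /^attempts:[0-9]+$/) { split($i, kv, ":"); attempts = kv[2] + 0 }
            }
        }
        END {
            if (servers == 0) servers = 1  # no nameserver line: the local machine is tried
            printf "servers=%d timeout=%ds attempts=%d nominal_max=%ds\n",
                   servers, timeout, attempts, servers * timeout * attempts
        }
    '
}

# Example with a documentation-range address (192.0.2.1) and the defaults.
printf 'nameserver 192.0.2.1\n' | resolv_budget
# Example with explicit, shorter limits.
printf 'nameserver 192.0.2.1\noptions timeout:1 attempts:1\n' | resolv_budget
```

On a real host this could be run as `resolv_budget < /etc/resolv.conf`. Note that with no `nameserver` line at all, resolv.conf(5) says the name server on the local machine is used, which may be relevant to scenario 1.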

Edit: I tried again after modifying /etc/nsswitch.conf so that only the dns method is used (`hosts: dns`).

Scenario 1:

###################### curl #######################
time curl -k -w "$WRITE_OUT" https://www.google.com/
000     0.000   0.000           0.000           0.000
curl: (6) Could not resolve host: www.google.com; Unknown error

real    0m0.051s
user    0m0.004s
sys     0m0.002s
##################### nslookup ####################
time nslookup www.google.com
;; connection timed out; trying next origin
;; connection timed out; no servers could be reached

real    0m24.287s
user    0m0.005s
sys     0m0.014s
######################## dig ######################
time dig www.google.com

; <<>> DiG 9.9.4-RedHat-9.9.4-51.el7 <<>> www.google.com
;; global options: +cmd
;; connection timed out; no servers could be reached

real    0m18.041s
user    0m0.005s
sys     0m0.005s

Scenario 2:

time curl -k -w "$WRITE_OUT" https://www.google.com/
000     0.000   0.000           0.000           19.527
curl: (6) Could not resolve host: www.google.com; Unknown error

real    0m20.533s
user    0m0.003s
sys     0m0.004s

time nslookup www.google.com
;; connection timed out; trying next origin
;; connection timed out; no servers could be reached

real    0m20.009s
user    0m0.005s
sys     0m0.005s

time dig www.google.com
; <<>> DiG 9.9.4-RedHat-9.9.4-51.el7 <<>> www.google.com
;; global options: +cmd
;; connection timed out; no servers could be reached

real    0m15.008s
user    0m0.005s
sys     0m0.003s

Scenario 3:

time curl -k -w "$WRITE_OUT" https://www.google.com/
000     0.000   0.000           0.000           4.512
curl: (6) Could not resolve host: www.google.com; Unknown error

real    0m5.518s
user    0m0.004s
sys     0m0.003s

time nslookup www.google.com
;; connection timed out; trying next origin
;; connection timed out; no servers could be reached

real    0m20.009s
user    0m0.005s
sys     0m0.005s

time dig www.google.com

; <<>> DiG 9.9.4-RedHat-9.9.4-51.el7 <<>> www.google.com
;; global options: +cmd
;; connection timed out; no servers could be reached

real    0m15.009s
user    0m0.005s
sys     0m0.005s

dig has its own timeout mechanism: timeout (5s) * tries (3) = 15s.
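
As a rough illustration of that arithmetic, the sketch below prints a dig command line with explicit `+time` and `+tries` values together with the nominal per-server budget they imply. The helper name `dig_budget` is made up for this example, and the product is only a nominal figure (scenario 1 above took 18s, not 15s).

```shell
#!/bin/sh
# Sketch: nominal per-server budget for dig given +time and +tries values.
# The defaults used here (5 and 3) are the ones quoted in the text above.
dig_budget() {
    time_s=${1:-5}
    tries=${2:-3}
    echo "dig +time=${time_s} +tries=${tries} www.google.com # up to $((time_s * tries))s"
}

dig_budget        # defaults
dig_budget 2 1    # a much shorter limit, e.g. for scripts
```

Prefixing the printed command with `time` allows a direct comparison with the measurements above.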

Xiaoming
  • Also remember that DNS is not the only source of name->IP mappings. You will find in `/etc/nsswitch.conf` the sources your system is using and in which order. – Patrick Mevzek Jan 20 '18 at 22:40
  • I only found one entry related to DNS: "hosts: files dns myhostname". Not sure what the last one, `myhostname`, means. Looks like I can change it to `hosts: dns` and try again – Xiaoming Jan 22 '18 at 14:49
  • Tried again; the result is the same – Xiaoming Jan 22 '18 at 14:59
  • nsswitch.conf has an impact on the `curl` command but no effect on `nslookup` or `dig`; it looks like `nslookup` and `dig` use /etc/resolv.conf directly. – Xiaoming Jan 22 '18 at 15:27
  • yes, as these two tools are only **DNS** tools, but any other application that asks the OS for an IP based on a name will call the appropriate `libc` function like `getaddrinfo` which then internally uses `nsswitch.conf`. – Patrick Mevzek Jan 22 '18 at 15:31
  • In all your cases above you should see the full `nslookup` output (run it interactively). Based on its messages, it seems to be clearly trying to reach one nameserver then another, explaining the time it took. You will have to see which nameservers it was querying. – Patrick Mevzek Jan 22 '18 at 15:37
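
To illustrate the point made in the comments about `nsswitch.conf`, here is a small sketch that extracts the lookup sources from the `hosts:` line of nsswitch.conf-style text. The function name `nss_hosts_sources` is hypothetical, and the sample input simply reuses the line quoted above.

```shell
#!/bin/sh
# Sketch: print the sources listed on the "hosts:" line of
# nsswitch.conf-style text read from stdin.
nss_hosts_sources() {
    awk '$1 == "hosts:" { $1 = ""; sub(/^ +/, ""); print; exit }'
}

# Sample input reusing the line quoted in the comments above.
printf 'passwd: files\nhosts: files dns myhostname\n' | nss_hosts_sources
```

On a real host, `nss_hosts_sources < /etc/nsswitch.conf` shows the order that `getaddrinfo`-based programs follow. `getent ahosts www.google.com` resolves through that same path, whereas `dig` and `nslookup` go straight to DNS.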

2 Answers

Although this is an old post, I feel like chiming in because it has come up for me more than once, so I want to share it.

One difference worth pointing out is which library the application (i.e., nslookup or curl) uses for DNS lookups, i.e., libresolv.so or libbind.so. nslookup appears to use the latter, which may be why its timeout differs from curl's. To check on your system, run

strace -o curl.out curl www.google.com
strace -o dig.out dig www.google.com

grep libresolv *.out
grep libbind *.out

and compare.

Although cryptic, the strace output should show which underlying system call is doing the work and, if you add `-T` or `-tt` to the strace command, how long each call waits.
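
As a sketch of how to read such a log, the helper below lists the system calls that took at least a given number of seconds. It assumes the log was produced with `strace -T`, which appends the time spent in each call as a trailing `<seconds>` field. The function name `slow_calls` and the two sample lines are hypothetical, not captured from a real run.

```shell
#!/bin/sh
# Sketch: list syscalls from an `strace -T` log (stdin) that took at least
# MIN seconds (default 1). Lines without a trailing <seconds> field are skipped.
slow_calls() {
    awk -v min="${1:-1}" '
        match($0, /<[0-9]+\.[0-9]+>$/) {
            t = substr($0, RSTART + 1, RLENGTH - 2) + 0
            if (t >= min) print
        }
    '
}

# Hypothetical two-line sample log, not taken from a real trace.
printf '%s\n' \
    'sendto(3, "...", 32, MSG_NOSIGNAL, NULL, 0) = 32 <0.000041>' \
    'poll([{fd=3, events=POLLIN}], 1, 5000) = 0 (Timeout) <5.005102>' |
    slow_calls 1
```

With a real trace, something like `strace -f -T -o curl.out curl www.google.com` followed by `slow_calls 1 < curl.out` should point at the calls where the time is spent.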

shaftdiesel

nslookup and similar tools query the DNS directly, whereas curl checks the local /etc/hosts first then queries the DNS. So that could be a clue to the issue at hand.

I've read that nslookup's deprecated. Do you have access to dig?

Neil Anuskiewicz
  • `dig` should indeed be preferred but will exhibit the same behaviour here – Patrick Mevzek Jan 20 '18 at 22:39
  • Agreed, same behavior here, I just thought I should mention it. Honestly, I don't even know why nslookup is deprecated. – Neil Anuskiewicz Jan 20 '18 at 23:10
  • `dig` is a superset, has more features, and more sane default behavior. See http://packetpushers.net/why-you-should-dig-the-dig-command/ for example for explanations. – Patrick Mevzek Jan 20 '18 at 23:55
  • Since my /etc/hosts doesn't include an entry for 'www.google.com', the time nslookup takes ~= the time curl takes, right? – Xiaoming Jan 22 '18 at 14:28
  • dig prints a lot of query and response detail that I'm not familiar with, which is why I chose `nslookup`. Will learn more about dig :) – Xiaoming Jan 22 '18 at 14:32
  • You can use the `+short` flag if you want only terse output, without all the details. But it will display only what is in the `Answer` section, not the other ones, so sometimes it is less confusing without the flag. – Patrick Mevzek Jan 22 '18 at 15:33