Speed up dig -x in bash script

Question

As a university exercise, I have to write a bash script that does a reverse DNS lookup for every address in a class B network block the university owns.

This is the fastest version I have come up with, but it still takes forever. Any help optimising this code?

#!/bin/bash
network="a.b"

CMD=/usr/bin/dig

for i in $(seq 1 254); do

    for y in $(seq 1 254); do
        answer=`$CMD -x $network.$i.$y +short`; 
        echo $network.$i.$y ' resolves to ' $answer >> hosts_a_b.txt;
    done
done

Think of **GNU Parallel** when you want to do lots of stuff in parallel... — Mark Setchell, Mar 06 '19 at 16:23
Doing more than 64000 DNS lookups is going to take a while no matter how you do it. — Barmar, Mar 06 '19 at 16:45

Charles Duffy · Accepted Answer · 2019-03-18T02:09:19.417

Using GNU xargs to run 64 processes at a time might look like:

#!/usr/bin/env bash

lookupArgs() {
  for arg; do
    # echo entire line together to ensure atomicity
    echo "$arg resolves to $(dig -x "$arg" +short)"
  done
}
export -f lookupArgs

network="a.b"
for (( x=1; x<=254; x++ )); do
  for (( y=1; y<=254; y++ )); do
    printf '%s.%s.%s\0' "$network" "$x" "$y"
  done
done | xargs -0 -P64 bash -c 'lookupArgs "$@"' _ >hosts_a_b.txt

Note that this doesn't guarantee order of output (and relies on the lookupArgs function doing one write() syscall per result) -- but output is sortable so you should be able to reorder. Otherwise, one could get ordered output (and ensure atomicity of results) by switching to GNU parallel -- a large perl script, vs GNU xargs' small, simple, relatively low-feature implementation.
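
Because every output line starts with a dotted-quad address, the results can be put back into numeric order after the run. A minimal sketch, assuming the line format written to hosts_a_b.txt above; the function name `sort_by_ip` is only illustrative:

```shell
# Sort "a.b.c.d resolves to ..." lines numerically, octet by octet.
# sort_by_ip is a hypothetical helper name, not part of the answer above.
sort_by_ip() {
  # -t. splits fields on dots; each -kN,Nn key compares one octet as a number.
  # The fourth field also holds the text after the last octet, but a numeric
  # key only reads the leading digits.
  sort -t. -k1,1n -k2,2n -k3,3n -k4,4n
}
```

It could then be used as `sort_by_ip <hosts_a_b.txt >hosts_a_b.sorted.txt`.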

edited Mar 18 '19 at 02:09

answered Mar 06 '19 at 17:11


Thank you so much. It is significantly faster. The same scan took more than an hour with my code but only 26 mins with your modification – The Shidoshi Mar 06 '19 at 19:01
It is a pretty dangerous solution: In this case it will work (reverse names are typically much shorter than 1000 bytes), but you are playing with fire. Longer lines (>1008 bytes) or multiple writes would open you to race conditions with half-line mixing. Not stressing the limitations makes it seem as if you are unaware of them: http://mywiki.wooledge.org/BashPitfalls#Non-atomic_writes_with_xargs_-P – Ole Tange Mar 18 '19 at 02:00
You've seen me stress those limitations in other situations where they *can* be reasonably expected to be hit -- this isn't our first discussion on the topic. That said, a bit more emphasis could be reasonably called for here, and I've edited appropriately. – Charles Duffy Mar 18 '19 at 02:07

Martin Kealey · Answer 2 · 2023-05-25T01:29:01.433

A fair chunk of the time is just starting the dig command for every single address.

Better would be to handle a reasonable number of addresses in one command, and then post-process that to produce the output you desire. (If you don't like the raw output, add the sed command I've given at the end of this answer.)

For a /16 network I suggest something like this:

    #!/bin/bash
    N=192.168.
    printf %s\\n -x"$N"{0..255}.{0..255} |
    xargs -r -n64 -P64 dig +noall +ans +nocl +nottl

The -n64 means that each invocation of dig will output around 3800 bytes depending on how long the resulting names are. As long as the entire output of each dig is less than 4096 bytes, it will be written as a single write() syscall.

If you have long domain names causing interleaving, the simplest fix is to reduce the -n option, but that will make it somewhat slower.
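
One way to check whether a batch stays under that limit is to count the bytes a single invocation produces. A minimal sketch; the helper name `fits_in_one_write` is made up for illustration, and the default of 4096 is simply the figure quoted above:

```shell
# Succeeds if the data on stdin is smaller than the given limit (default 4096).
# Note: limit and bytes are plain global variables, kept simple for portability.
fits_in_one_write() {
  limit=${1:-4096}
  bytes=$(wc -c | tr -d ' ')   # some wc implementations pad the count with spaces
  [ "$bytes" -lt "$limit" ]
}
```

For example, in bash, `dig +noall +ans +nocl +nottl -x192.168.0.{0..63} | fits_in_one_write || echo "consider a smaller -n"` would test one batch of 64 addresses.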

Another way to mitigate this is to have each invocation of dig write to a separate output file, and then combine them at the end. For example:

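    #!/bin/bash
    # Private scratch directory, removed again when the script exits
    d=$(mktemp -d) || exit
    trap 'rm -rf "$d"' EXIT
    N=192.168.
    p=0 w=64    # p counts started jobs, w is the maximum running at once
    for (( c=0 ; c < 256 ; ++c )) do
        # Once w jobs have been started, wait for one to finish before starting another
        (( ++p <= w )) || wait -n
        # -x requests a reverse lookup, as in the first example
        dig +noall +ans +nocl +nottl -x"$N$c".{0..255} > "$d/$c" &
    done
    wait
    cat "$d"/*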
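
Note that the glob in `cat "$d"/*` expands in lexical order (0, 1, 10, 100, ...), so the blocks come out in that order rather than numerically. If numeric order matters, the file names can be sorted first. A minimal sketch, assuming the directory holds only the numerically named files written by the loop above; `cat_numeric` is an illustrative name:

```shell
# Concatenate the files in a directory in numeric order of their names.
cat_numeric() {
  dir=$1
  # ls prints one name per line when writing to a pipe; sort -n orders them as numbers.
  ls "$dir" | sort -n | while read -r name; do
    cat "$dir/$name"
  done
}
```

With this helper, the final line of the script would become `cat_numeric "$d"`.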

Lastly, if you have administrative access to the authoritative nameserver, then you could configure the nameserver to allow zone transfers to a host of your choosing. Then you could get everything in a few seconds using a single command:

    #!/bin/bash
    dig @ns.your.domain. +noall +ans +nocl +nottl axfr 168.192.in-addr.arpa

The commands I've given above all output the raw dig format; you can convert this to the requested format using a filter like this:

    #!/bin/bash
    (command as above ...) |
    sed 's/^\([0-9.]*\)\..*[[:space:]]/\1\t/
         s/\.$//
         s/^\(.*\)\.\(.*\)\.\(.*\)\.\(.*\)\t/\4.\3.\2.\1\t/
         s/\t/  resolves to  /'
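
The `\t` escape used in the sed script is not guaranteed by POSIX sed, so it may not work everywhere. A similar conversion can be sketched in awk. This assumes each answer line has the shape `4.3.2.1.in-addr.arpa. PTR host.example.com.` (owner name, record type, target), which is what the dig options above are expected to produce; `format_ptr` is an illustrative name, and unlike the sed version it skips lines that are not PTR records and uses single spaces around "resolves to":

```shell
# Convert raw PTR answer lines into "a.b.c.d resolves to name".
format_ptr() {
  awk '$2 == "PTR" {
    # Split the owner name, e.g. 4.3.2.1.in-addr.arpa., on dots
    split($1, o, /\./)
    name = $3
    sub(/\.$/, "", name)   # drop the trailing root dot
    # The first four labels are the octets in reverse order
    printf "%s.%s.%s.%s resolves to %s\n", o[4], o[3], o[2], o[1], name
  }'
}
```

It would be used in place of the sed command, for example `(command as above ...) | format_ptr`.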
