
I want to extract the domain name from DNS packets (requests and responses) in a .pcapng file. This is the code I used:

import pyshark


def extract_domain_name(pkt):
    try:
        if pkt.dns.qry_name:
            # print(pkt.ip.src, pkt.dns.qry_name)
            return pkt.dns.qry_name
    except AttributeError:
        # ignore packets that aren't DNS requests
        pass
    try:
        if pkt.dns.resp_name:
            print(pkt.ip.src, pkt.dns.resp_name)
            return pkt.dns.resp_name
    except AttributeError:
        # ignore packets that aren't DNS responses
        pass
        

def process_pcapng_file(filename):
    # Open the pcapng file
    cap = pyshark.FileCapture(filename)

    # Extract the domain names from the DNS packets
    domains = set()
    for pkt in cap:
        #print (pkt)
        if 'DNS' in pkt:
            #domain = pkt.dns.qry_name
            domain = extract_domain_name(pkt)
            if domain is not None:
                domains.add(domain)

It only extracts the domain name from query packets, not from both queries and responses. What could the problem be?

I also tried using if pkt.dns.resp_name: without the try block, and I got an AttributeError.

  • I can't reproduce this problem; even if I strip out all the `try/except` blocks in `extract_domain_name` the code works without a problem for all the packet captures I've produced locally. Can you provide a link to a pcap file that reproduces this issue? – larsks Feb 20 '23 at 20:48
  • @larsks the following url https://github.com/chenshaojie-happy/DNS-covert-channel-detection-method-using-the-LSTM-model/blob/main/datasets/det/det_a_up.pcapng – bany salameh Feb 21 '23 at 08:43

1 Answer


Thanks for posting the sample capture; that helps.

I think the reason your code works for me but not for you is that in your sample capture, the only replies are SERVFAIL messages:

$ tcpdump -nn -r sample.pcap port domain | awk '{print $7}' | sort | uniq -c
  38365 A?
  38393 ServFail

It looks like for SERVFAIL messages, pkt.dns will not have a resp_name attribute.

"it only extract the domain name from query packets not from query and response"

Just to be explicit: in your sample capture, there are no valid query responses, so pkt.dns.resp_name is never defined.
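
To check this on a capture from Python rather than with tcpdump, a rough sketch like the following counts how many DNS packets expose resp_name at all. The helper name count_resp_name and the SimpleNamespace objects are hypothetical stand-ins so the snippet runs without a capture file; with pyshark you would pass the FileCapture object instead, on the assumption that a missing layer or field raises AttributeError (which is what the error in the question suggests).

```python
from types import SimpleNamespace


def count_resp_name(packets):
    """Return (with_resp_name, without_resp_name) for packets that have a DNS layer."""
    with_name = without_name = 0
    for pkt in packets:
        dns = getattr(pkt, "dns", None)
        if dns is None:
            continue  # no DNS layer on this packet
        if hasattr(dns, "resp_name"):
            with_name += 1
        else:
            without_name += 1
    return with_name, without_name


# Hypothetical stand-ins for pyshark packets (not real pyshark objects).
query = SimpleNamespace(dns=SimpleNamespace(qry_name="example.com"))
servfail = SimpleNamespace(dns=SimpleNamespace(qry_name="example.com"))
answer = SimpleNamespace(
    dns=SimpleNamespace(qry_name="example.com", resp_name="example.com")
)
non_dns = SimpleNamespace()

print(count_resp_name([query, servfail, answer, non_dns]))
```

If the first number comes back as zero for a real capture, no packet in it has a resp_name to extract.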

There are a couple of things to think about here:

  1. If your logic is effectively:

    if pkt.dns.qry_name:
      return pkt.dns.qry_name
    if pkt.dns.resp_name:
      return pkt.dns.resp_name
    

    You will never reach the second if statement because a query response also includes the original query (so you will always return pkt.dns.qry_name).

  2. Do you really care about resp_name? In all cases, either pkt.dns.resp_name will match pkt.dns.qry_name, or it won't exist.
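
The first point can be seen with a small sketch that reports which branch a query-first check takes. The function branch_taken and the SimpleNamespace packet are hypothetical illustrations, not pyshark objects; the stand-in simply carries both fields, as a successful response would.

```python
from types import SimpleNamespace


def branch_taken(pkt):
    """Report which field a query-first check would return from."""
    dns = pkt.dns
    if getattr(dns, "qry_name", None):
        return "qry_name"
    if getattr(dns, "resp_name", None):
        return "resp_name"
    return None


# Hypothetical stand-in for a successful response: it carries both fields.
response = SimpleNamespace(
    dns=SimpleNamespace(qry_name="example.com", resp_name="example.com")
)

print(branch_taken(response))  # prints "qry_name"
```

Even for a packet that does have resp_name, the query-first order returns from the qry_name branch.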

It seems to me you could simplify your code to:

import pyshark


def process_pcapng_file(filename):
    cap = pyshark.FileCapture(filename)

    # Collect the distinct query names from packets whose top layer is DNS
    return set(
        pkt.dns.qry_name
        for pkt in cap
        if pkt.highest_layer == "DNS" and pkt.dns.qry_name
    )

But if you want to use your existing extract_domain_name function, you'll need to reverse the checks for resp_name and qry_name:

def extract_domain_name(pkt):
    try:
        if pkt.dns.resp_name:
            return pkt.dns.resp_name
    except AttributeError:
        pass

    try:
        if pkt.dns.qry_name:
            return pkt.dns.qry_name
    except AttributeError:
        pass

You can make that a little shorter by replacing the exception handling with hasattr:

def extract_domain_name(pkt):
    if hasattr(pkt.dns, "resp_name") and pkt.dns.resp_name:
        return pkt.dns.resp_name

    if hasattr(pkt.dns, "qry_name") and pkt.dns.qry_name:
        return pkt.dns.qry_name
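
Another possible variant, sketched here on the assumption that a missing field raises AttributeError (which is what the try/except versions rely on), uses getattr with a default. The SimpleNamespace packets are hypothetical stand-ins so the snippet runs without pyshark or a capture file.

```python
from types import SimpleNamespace


def extract_domain_name(pkt):
    """Prefer resp_name, fall back to qry_name, and return None if neither is set."""
    dns = pkt.dns
    return getattr(dns, "resp_name", None) or getattr(dns, "qry_name", None) or None


# Hypothetical stand-ins for pyshark packets (not real pyshark objects).
servfail = SimpleNamespace(dns=SimpleNamespace(qry_name="example.com"))
answer = SimpleNamespace(
    dns=SimpleNamespace(qry_name="example.com", resp_name="example.com")
)

print(extract_domain_name(servfail))
print(extract_domain_name(answer))
```

The trailing "or None" keeps the return value None when a field exists but is empty, so a caller that tests "is not None" behaves the same as with the versions above.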
larsks
  • Thanks for your reply. You are right: all the responses in my file are SERVFAIL, so the `dns.resp_name` attribute is not available. – bany salameh Mar 01 '23 at 10:57
  • I have a pcapng file with 604510 DNS packets, 317926 of which are query packets. However, the code extracts only 9000 domain names from it, and no errors are shown. Why could this happen? @larsks – bany salameh Mar 01 '23 at 12:02
  • I'm not sure. The best way to diagnose the problem is to identify a packet that it misses, and then add a conditional breakpoint to your code and step through it to see what's happening. I think Wireshark will allow you to create a minimal pcap file with only non-matching packets to use for testing. – larsks Mar 01 '23 at 13:15
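
One thing that may be worth ruling out before stepping through individual packets: the code collects names into a set, and a set keeps each name only once, so the result is the number of distinct names rather than the number of query packets. Whether that explains the gap in this particular capture is only a guess. A minimal sketch for comparing the two numbers, using a hypothetical helper query_name_counts and SimpleNamespace stand-ins instead of a real capture:

```python
from collections import Counter
from types import SimpleNamespace


def query_name_counts(packets):
    """Count how many packets carry each DNS query name."""
    counts = Counter()
    for pkt in packets:
        # Missing layer or field falls back to None instead of raising.
        name = getattr(getattr(pkt, "dns", None), "qry_name", None)
        if name:
            counts[str(name)] += 1
    return counts


# Hypothetical stand-ins for pyshark packets (not real pyshark objects).
packets = [
    SimpleNamespace(dns=SimpleNamespace(qry_name="a.example")),
    SimpleNamespace(dns=SimpleNamespace(qry_name="a.example")),
    SimpleNamespace(dns=SimpleNamespace(qry_name="b.example")),
    SimpleNamespace(),  # no DNS layer
]

counts = query_name_counts(packets)
print("packets with a query name:", sum(counts.values()))
print("distinct query names:", len(counts))
```

If the first number is close to the packet count and the second is close to 9000, the names are being extracted and the set is simply collapsing repeats.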