4

Using tcpdump, I am capturing network traffic. I am interested in extracting the actual TCP payload data, i.e. HTTP traffic in my particular case.

I tried to do that with Scapy, but I only found the function remove_payload(). Is there a corresponding counterpart? Or do you know of any other tools that provide such functionality?

Unfortunately, I could not find satisfactory Scapy documentation.

Florian

2 Answers

8

You can easily read a pcap file in Scapy with rdpcap. The Raw layer, which sits right above TCP, then gives you access to the HTTP content of each packet:

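from scapy.all import rdpcap, Raw

# Load every packet from the capture file.
pcap = rdpcap("my_file.pcap")

for pkt in pcap:
    # Raw holds the bytes that follow the TCP header, i.e. the HTTP data here.
    if Raw in pkt:
        print(pkt[Raw].load)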
Jeff Bencteux
  • Any suggestions on how I can write those bytes to a file rather than to the console? Also, I would not want to output additional newlines. – Florian Jun 29 '16 at 19:43
  • Python's `open()` and `write()` deal with I/O. Python's `print` function appends a line feed, so if you write the bytes to a file instead of printing them, it will not be there anymore. – Jeff Bencteux Jun 30 '16 at 06:54
  • I should have been more clear: `write` does not allow the call `f.write(pkt[Raw])` (TypeError: must be convertible to a buffer, not Raw). I could not find a solution to this problem. – Florian Jun 30 '16 at 07:36
  • Found it: `out = open("out.txt", "wb")` and `out.write(pkt[Raw].load)`. While Scapy seems to be really powerful, it's a shame that there is essentially no documentation. – Florian Jun 30 '16 at 07:49
  • There is some [here](http://www.secdev.org/projects/scapy/files/scapydoc.pdf), [here](http://www.secdev.org/projects/scapy/doc/) and a short tutorial [here](https://github.com/secdev/scapy/blob/master/doc/notebooks/Scapy%20in%2015%20minutes.ipynb) but yes, there is a lack of documentation. – Jeff Bencteux Jun 30 '16 at 08:12
  • 1
    This answer only works if packets are delivered in sequence and have no missing sequence numbers. The point of using TCP is for handling exceptions to those situations (which are common in the real world). – Mike Pennington Aug 10 '18 at 11:44
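
The two points raised in the comments, writing `pkt[Raw].load` to a file and coping with out-of-order or retransmitted segments, can be combined in a rough sketch. The helper names `reassemble` and `extract_streams` and the output file naming are made up for illustration. The sketch assumes IPv4, does not fill gaps, ignores sequence-number wraparound, and is not a substitute for a real TCP reassembler:

```python
import sys
from collections import defaultdict


def reassemble(segments):
    """Join (seq, payload) pairs from one direction of one TCP connection.

    Segments are ordered by sequence number, and bytes already covered by
    an earlier segment (retransmissions, overlaps) are dropped.
    """
    data = bytearray()
    next_seq = None
    for seq, payload in sorted(segments, key=lambda s: s[0]):
        if next_seq is None:
            next_seq = seq
        if seq + len(payload) <= next_seq:
            continue  # nothing new in this segment
        if seq < next_seq:
            payload = payload[next_seq - seq:]  # trim the overlapping prefix
        data += payload
        next_seq = max(seq, next_seq) + len(payload)
    return bytes(data)


def extract_streams(pcap_path):
    """Return {(src, sport, dst, dport): payload bytes} for a pcap file."""
    # Imported here so that reassemble() can be used without Scapy installed.
    from scapy.all import rdpcap, IP, TCP, Raw

    streams = defaultdict(list)
    for pkt in rdpcap(pcap_path):
        if IP in pkt and TCP in pkt and Raw in pkt:
            key = (pkt[IP].src, pkt[TCP].sport, pkt[IP].dst, pkt[TCP].dport)
            streams[key].append((pkt[TCP].seq, pkt[Raw].load))
    return {key: reassemble(segs) for key, segs in streams.items()}


if __name__ == "__main__" and len(sys.argv) > 1:
    for (src, sport, dst, dport), payload in extract_streams(sys.argv[1]).items():
        # One binary file per direction of each connection, no extra newlines.
        with open(f"{src}_{sport}-{dst}_{dport}.bin", "wb") as out:
            out.write(payload)
```

Each direction of a connection ends up in its own file, so a request and its response are written separately.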
8

In case other users might have similar questions: I ended up using the following script:

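infile=infile.pcap
outfile=outfile
ext=txt

# Start with a fresh combined output file.
rm -f "${outfile}_all.${ext}"

# List the index of every TCP stream that contains a SYN, one per line.
for stream in $(tshark -nlr "$infile" -Y "tcp.flags.syn==1" -T fields -e tcp.stream | sort -nu | sed 's/\r//')
do
    echo "Processing stream $stream: ${outfile}_${stream}.${ext}"
    # Dump the stream as hex, drop the header lines, strip leading whitespace,
    # and turn the hex back into bytes; save per stream and append to the combined file.
    tshark -nlr "$infile" -qz "follow,tcp,raw,$stream" | tail -n +7 | sed 's/^\s\+//g' | xxd -r -p | tee "${outfile}_${stream}.${ext}" >> "${outfile}_all.${ext}"
done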
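
For anyone without `xxd`, the last stage of the pipeline (stripping leading whitespace and converting hex back to bytes) could be approximated in Python. This is only a sketch: the function name `decode_hex_lines` is made up, and it assumes the input is the line-by-line text output of the `follow,tcp,raw` dump, where every data line consists purely of hex digits. Lines that are not valid hex, such as headers or separators, are skipped instead of being cut off with `tail`:

```python
import string


def decode_hex_lines(lines):
    """Convert lines of plain hex text into the bytes they encode."""
    out = bytearray()
    for line in lines:
        text = line.strip()  # drop indentation and the trailing newline
        if not text or len(text) % 2 != 0:
            continue  # empty line or not a whole number of bytes
        if not all(ch in string.hexdigits for ch in text):
            continue  # header, separator, or other non-hex line
        out += bytes.fromhex(text)
    return bytes(out)
```

Reading the dump from standard input and writing the result would then be `sys.stdout.buffer.write(decode_hex_lines(sys.stdin))`.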
Florian