I am writing a tool in Python (platform: Linux); one of its tasks is to capture a live TCP stream and apply a function to each line. Currently I'm using:

import subprocess
proc = subprocess.Popen(['sudo', 'tcpflow', '-C', '-i', interface, '-p', 'src', 'host', ip], stdout=subprocess.PIPE)

for line in iter(proc.stdout.readline,''):
    do_something(line)

This works quite well (with the appropriate entry in /etc/sudoers), but I would like to avoid calling an external program.
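For reference, the readline loop above can be exercised with a harmless command in place of sudo/tcpflow; a minimal Python 3 sketch (note the b'' sentinel, since pipes yield bytes in Python 3, whereas the snippet above uses the Python 2 '' sentinel):

```python
import subprocess

def stream_lines(cmd, handle):
    # run cmd and apply handle() to each stdout line as it arrives
    proc = subprocess.Popen(cmd, stdout=subprocess.PIPE)
    try:
        # b'' sentinel: readline returns b'' once the pipe closes
        for line in iter(proc.stdout.readline, b''):
            handle(line.rstrip(b'\n'))
    finally:
        proc.stdout.close()
        proc.wait()

# demo with a harmless command instead of the tcpflow invocation
collected = []
stream_lines(['sh', '-c', 'echo one; echo two'], collected.append)
```

The same loop body would take the tcpflow command line instead of the demo command.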

So far I have looked into the following possibilities:

  • flowgrep: a python tool which looks just like what I need, BUT: it uses pynids internally, which is 7 years old and seems pretty much abandoned. There is no pynids package for my gentoo system and it ships with a patched version of libnids which I couldn't compile without further tweaking.

  • scapy: this is a packet manipulation program/library for Python; I'm not sure if TCP stream reassembly is supported.

  • pypcap or pylibpcap as wrappers for libpcap. Again, libpcap is for packet capturing, whereas I need stream reassembly, which is not possible according to this question.

Before I dive deeper into any of these libraries, I would like to know if someone has a working code snippet (this seems like a rather common problem). I would also be grateful for advice about the right way to go.

Thanks

PiQuer

2 Answers


Jon Oberheide has led efforts to maintain pynids, which is fairly up to date at: http://jon.oberheide.org/pynids/

So this might permit you to further explore flowgrep. Pynids itself handles stream reconstruction rather elegantly. See http://monkey.org/~jose/presentations/pysniff04.d/ for some good examples.
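To give an idea of what "stream reconstruction" looks like with pynids, here is a pseudocode-level sketch of the callback shape, modeled on the linked examples. It is untested here (it needs libnids and capture privileges), the interface name is a placeholder, and do_something stands for whatever per-stream handler you want:

```
import nids

def handle_tcp(tcp):
    # called on every TCP event; state names follow the linked examples
    if tcp.nids_state == nids.NIDS_JUST_EST:
        tcp.server.collect = 1      # collect data flowing to the server
        tcp.client.collect = 1      # and data flowing back to the client
    elif tcp.nids_state == nids.NIDS_DATA:
        tcp.discard(0)              # keep everything reassembled so far
        do_something(tcp.server.data[:tcp.server.count])

nids.param("device", "eth0")        # placeholder capture interface
nids.init()
nids.register_tcp(handle_tcp)
nids.run()                          # blocks; needs capture privileges
```

Note that, as the comment below points out, the callback fires on connection establishment, so already-established streams are not picked up.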

Matthew G
  • Thanks for the link. I totally missed this newer version of pynids, and I could compile this version. But first tests indicate that with libnids it is only possible to capture newly established TCP connections because of the way the callback function works (I need to capture an ongoing stream). Well, this is outside the scope of this question... – PiQuer Jan 06 '12 at 16:27
  • Pynids is now archived. Does anyone have an idea about other Python libraries that implement TCP reassembly functionality? – Radwa Ahmed May 12 '23 at 12:43

Just as a follow-up: I abandoned the idea of monitoring the stream on the TCP layer. Instead, I wrote a proxy in Python and let the connection I want to monitor (an HTTP session) connect through this proxy. The result is more stable and does not need root privileges to run. This solution depends on pymiproxy.

This goes into a standalone program (Python 2, as the StringIO/httplib imports indicate), e.g. helper_proxy.py:

from multiprocessing.connection import Listener
import StringIO
from httplib import HTTPResponse
import threading
import time
from miproxy.proxy import RequestInterceptorPlugin, ResponseInterceptorPlugin, AsyncMitmProxy

class FakeSocket(StringIO.StringIO):
    def makefile(self, *args, **kw):
        return self

class Interceptor(RequestInterceptorPlugin, ResponseInterceptorPlugin):
    conn = None
    def do_request(self, data):
        # do whatever you need with the request data here; I'm only interested in responses
        return data
    def do_response(self, data):
        if Interceptor.conn:   # if the listener is connected, send the response to it
            response = HTTPResponse(FakeSocket(data))
            response.begin()
            Interceptor.conn.send(response.read())
        return data

def main():
    proxy = AsyncMitmProxy()
    proxy.register_interceptor(Interceptor)
    proxy_thread = threading.Thread(target=proxy.serve_forever)
    proxy_thread.daemon = True
    proxy_thread.start()
    print "Proxy started."
    address = ('localhost', 6000)     # family is deduced to be 'AF_INET'
    listener = Listener(address, authkey='some_secret_password')
    while True:
        Interceptor.conn = listener.accept()
        print "Accepted Connection from", listener.last_accepted
        try:
            Interceptor.conn.recv()
        except:
            time.sleep(1)
        finally:
            Interceptor.conn.close()

if __name__ == '__main__':
    main()

Start it with python helper_proxy.py. This creates a proxy listening for HTTP connections on port 8080 and listening for another Python program on port 6000. Once the other program has connected on that port, the helper proxy sends all HTTP replies to it. This way the helper proxy can keep running, maintaining the HTTP connection, while the listener can be restarted for debugging.
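The FakeSocket/HTTPResponse trick used in do_response can be exercised on its own, without the proxy. A Python 3 port of the idea (io.BytesIO and http.client in place of StringIO and httplib; the raw response bytes here are made up for the demo):

```python
import io
from http.client import HTTPResponse

class FakeSocket:
    # wraps raw response bytes so http.client can parse them
    def __init__(self, data):
        self._file = io.BytesIO(data)
    def makefile(self, *args, **kw):
        return self._file

# a hand-written raw HTTP response, standing in for the intercepted data
raw = (b"HTTP/1.1 200 OK\r\n"
       b"Content-Length: 5\r\n"
       b"Content-Type: text/plain\r\n"
       b"\r\n"
       b"hello")

resp = HTTPResponse(FakeSocket(raw))
resp.begin()          # parse the status line and headers
body = resp.read()    # read the payload according to Content-Length
```

This is the same parsing that do_response performs before sending the body over the multiprocessing connection.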

Here is how the listener works, e.g. listener.py:

from multiprocessing.connection import Client

def main():
    address = ('localhost', 6000)
    conn = Client(address, authkey='some_secret_password')
    while True:
        print conn.recv()

if __name__ == '__main__':
    main()

This will just print all the replies. Now point your browser at the proxy running on port 8080 and establish the HTTP connection you want to monitor.
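The Listener/Client channel between the two programs can also be demonstrated in a single Python 3 process (note authkey must be bytes on Python 3; port 6001 is a hypothetical choice so the demo doesn't clash with the helper proxy):

```python
from multiprocessing.connection import Listener, Client
import threading

address = ('localhost', 6001)     # hypothetical port for the demo
listener = Listener(address, authkey=b'some_secret_password')

def serve_one():
    conn = listener.accept()          # blocks until the client connects
    conn.send('HTTP/1.1 200 OK')      # one fake "reply", as helper_proxy.py would send
    conn.close()

t = threading.Thread(target=serve_one)
t.start()

client = Client(address, authkey=b'some_secret_password')
reply = client.recv()
client.close()
t.join()
listener.close()
```

In the real setup the accept() side lives in helper_proxy.py and the Client side in listener.py, but the wire protocol is exactly this.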

PiQuer