There are a few Wireshark .pcap files. I need to split each .pcap into incoming and outgoing traffic (by specifying source and destination MAC addresses), and the split files have to be written to two different folders, namely Incoming and Outgoing. The output files must have the same names as the input files and be written as .csv files. I tried the code below, but it is not working. Any help is greatly appreciated. Thanks.

import os
import csv
startdir= '/root/Desktop/Test'
suffix= '.pcap'
for root,dirs, files, in os.walk(startdir):
    for name in files:
        if name.endswith(suffix):
            filename=os.path.join(root,name)
            cmdOut = 'tshark -r "{}" -Y "wlan.sa==00:00:00:00:00:00 && wlan.da==11:11:11:11:11:11" -T fields -e frame.time_delta_displayed -e frame.len -E separator=, -E header=y > "{}"'.format(filename,filename)
            cmdIn = 'tshark -r "{}" -Y "wlan.sa==11:11:11:11:11:11 && wlan.da==00:00:00:00:00:00" -T fields -e frame.time_delta_displayed -e frame.len -E separator=, -E header=y > "{}"'.format(filename,filename)
            #os.system(cmd1)
            #os.system(cmd2)

            with open('/root/Desktop/Incoming/', 'w') as csvFile:
                writer = csv.writer(csvFile)
                writer.writerows(os.system(cmdIn))

            with open('/root/Desktop/Outgoing/', 'w') as csvFile:
                writer = csv.writer(csvFile)
                writer.writerows(os.system(cmdOut))

            csvFile.close()
Hasa
  • Please add your code as text instead of image – kuro Jul 03 '19 at 10:52
  • You are not using proper filename for incoming and outgoing csv files. Also, why do you need `csvFile.close()`? – kuro Jul 03 '19 at 10:54
  • @kuro sorry, I am new to Python and not sure where I'm going wrong. What should I write for the filenames? Thanks – Hasa Jul 03 '19 at 11:01
  • @kuro I have added my code as text. – Hasa Jul 03 '19 at 11:14
  • Using string concatenation to form commands is a serious security problem (which means it's hard to use `os.system()` securely). If you switch to `subprocess.run()` without `shell=True`, it's much easier to do the right thing, since you can just pass everything but the redirection as a list element. – Charles Duffy Jul 03 '19 at 11:18
  • ...that said, maybe you might just directly stream the output from the `tshark` commands into your script, and not bother writing them to temporary files at all? – Charles Duffy Jul 03 '19 at 11:21
  • BTW, generally, we ask that you describe *exactly* how your code is broken (showing the exact error message, exception, or behavior), rather than just describing it as "not working". – Charles Duffy Jul 03 '19 at 11:22
  • @CharlesDuffy where should I change the code to switch to subprocess.run()? – Hasa Jul 03 '19 at 11:23
  • @CharlesDuffy I need to get the output into a csv file, but my input .pcap file gets changed when I run this code. – Hasa Jul 03 '19 at 11:46
  • Have you considered using [`pyshark`](https://github.com/KimiNewt/pyshark)? It wraps the tshark commands in a nice python API. – Alex Jul 03 '19 at 19:13

1 Answer

A correct implementation might look more like:

import csv
import os
import subprocess

startdir = 'in.d'    # obviously, people other than you won't have /root/Desktop/test
outdir = 'out.d'
suffix = '.pcap'

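def decode_to_file(cmd, in_file, new_suffix):
    # Mirror the input's path (relative to startdir) under outdir, swapping the extension.
    rel_base = os.path.relpath(in_file, startdir)[:-len(suffix)]
    fileName = os.path.join(outdir, rel_base + new_suffix)
    os.makedirs(os.path.dirname(fileName), exist_ok=True)
    proc = subprocess.Popen(cmd, stdout=subprocess.PIPE)
    # newline='' lets the csv module control line endings; 'with' closes the file.
    with open(fileName, 'w', newline='') as out_file:
        csv_writer = csv.writer(out_file)
        for line_bytes in proc.stdout:
            line_str = line_bytes.decode('utf-8')
            csv_writer.writerow(line_str.strip().split(','))
    proc.wait()  # reap the tshark process once its output has been consumed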

for root, dirs, files in os.walk(startdir):
    for name in files:
        if not name.endswith(suffix):
            continue
        in_file = os.path.join(root, name)
        cmdCommon = [
            'tshark', '-r', in_file,
            '-T', 'fields',
            '-e', 'frame.time_delta_displayed',
            '-e', 'frame.len',
            '-E', 'separator=,',
            '-E', 'header=y',
        ]

        decode_to_file(
            cmd=cmdCommon + ['-Y', 'wlan.sa==00:00:00:00:00:00 && wlan.da==11:11:11:11:11:11'],
            in_file=in_file,
            new_suffix='.out.csv'
        )
        decode_to_file(
            cmd=cmdCommon + ['-Y', 'wlan.sa==11:11:11:11:11:11 && wlan.da==00:00:00:00:00:00'],
            in_file=in_file,
            new_suffix='.in.csv'
        )

Note:

  • We don't use os.system(). (This would never have worked, since it returns a numeric exit status, not strings in a format you can write to a CSV file.)
  • We don't need to generate any temporary files; we can read directly into our Python code from the stdout of the tshark subprocess.
  • We construct our output file name by modifying the input file name (replacing its extension with .out.csv and .in.csv, respectively).
  • Because writerow() requires an iterable, we generate one by splitting each line of output on commas.
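
As a rough illustration of the first two points, the sketch below runs a child process and reads its stdout directly. To stay runnable without tshark or capture files, it uses the Python interpreter itself as a stand-in command that prints one made-up comma-separated line; the `rows_from_command` name is hypothetical and not part of the code above.

```python
import subprocess
import sys

def rows_from_command(cmd):
    """Run cmd (a list, no shell) and return (exit status, stdout split into rows)."""
    proc = subprocess.run(cmd, stdout=subprocess.PIPE)
    # proc.returncode is only an integer exit status, like the value os.system()
    # returns; the actual data arrives as bytes on proc.stdout.
    text = proc.stdout.decode('utf-8')
    return proc.returncode, [line.split(',') for line in text.splitlines()]

# Stand-in for a tshark invocation (assumption: made-up sample output).
demo_cmd = [sys.executable, '-c', 'print("0.000000000,60")']
status, rows = rows_from_command(demo_cmd)
print(status, rows)
```

The exit status is useful for detecting failure, but only the decoded stdout contains anything that can be written out as rows.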

Note that I'm not completely clear why you wanted to use the Python CSV module at all: the fields output appears to already be CSV, so one could also just redirect the output straight to a file with no other processing.
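
A minimal sketch of that alternative, assuming the command's output really is already in the desired CSV form: pass an open file as `stdout` so the child process writes to it directly. The `run_to_file` helper and the stand-in command are hypothetical; in practice `cmd` would be the tshark argument list built in the loop above.

```python
import os
import subprocess
import sys
import tempfile

def run_to_file(cmd, out_path):
    """Run cmd (a list, no shell) with its stdout written straight to out_path."""
    with open(out_path, 'wb') as out_file:
        return subprocess.run(cmd, stdout=out_file).returncode

# Stand-in command so the sketch runs without tshark or capture files.
demo_cmd = [sys.executable, '-c', 'print("frame.time_delta_displayed,frame.len")']
demo_path = os.path.join(tempfile.mkdtemp(), 'demo.out.csv')
exit_status = run_to_file(demo_cmd, demo_path)
with open(demo_path) as f:
    print(exit_status, f.read().strip())
```

This keeps the no-shell, list-of-arguments style while doing the same job as a shell `>` redirection, and it never touches the input file.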

Charles Duffy
  • Thank you so much. I ran the code, but I am getting the error below. I googled it but could not find a solution. with csv.writer(open(outFileName, 'w')) as outCsv: AttributeError: __enter__ – Hasa Jul 03 '19 at 13:24
  • So the version of Python you're using doesn't have csv objects as context managers. Just store it as a regular variable instead, and close it explicitly. I've edited to demonstrate that change. – Charles Duffy Jul 03 '19 at 13:59
  • I am using PyCharm. The project interpreter is Python 3.7. I changed the code as you have done, but now I'm getting these errors. tshark: output fields were specified with "-e", but "-Tek, -Tfields, -Tjson or -Tpdml" was not specified. To check, I ran the same tshark command in a Linux terminal, and it gave the same error. Also, when I run this Python code it gives another error: AttributeError: '_csv.writer' object has no attribute 'close' – Hasa Jul 03 '19 at 17:18
  • I typed the tshark command in a Linux terminal with -T fields instead of -T tabs, and then it works fine. If we make this change in the Python code, will it be a problem, given that you said using a generator to further split each line on tabs is easy? – Hasa Jul 03 '19 at 17:23
  • `tabs` is valid in wireshark 3.0.1; which version are you running? – Charles Duffy Jul 03 '19 at 18:55
  • ....the reason to use `tabs` instead of `fields` is that tabs are less likely than general whitespace to show up inside of data values, so your code is less likely to have spurious column divisions if you use them. – Charles Duffy Jul 03 '19 at 19:13
  • Thanks for all the replies, Charles. It's Wireshark 2.6.1. When we use tabs I'm getting tshark: output fields were specified with "-e", but "-Tek, -Tfields, -Tjson or -Tpdml" was not specified. But when I use fields instead of tabs, I get an error at .writerows(line.strip().split('\t') for line in inProc.stdout) – Hasa Jul 03 '19 at 20:59
  • I changed the code to .writerows(line.strip().split(',') for line in inProc.stdout), using fields, and then I get TypeError: a bytes-like object is required, not 'str' – Hasa Jul 03 '19 at 21:02
  • *Always*, in Python, to get a string to be a byte object, you `encode()` it. (I don't think you specified Python 3 in the question; in Python 2.x, the unicode handling is much worse, but in a way that happens to make this easier). – Charles Duffy Jul 03 '19 at 21:41
  • ...so you might end up with something like `inCsv.writerows(line.strip().encode('utf-8').split(b'\t') for line in inProc.stdout)`. Though I'd need to know what `fields` output mode looks like to know what to replace the `\t` with if not using `tabs`. – Charles Duffy Jul 03 '19 at 21:45
  • When I use -T fields, the output csv files look like this (this is when I typed the command in a Linux terminal, not in the Python code): values separated by commas and written as rows, one after the other. I have added an image of the output: https://i.stack.imgur.com/9HasT.png – Hasa Jul 04 '19 at 08:53
  • Actually, that's the output I want to get as well. When we use tabs instead of fields, csv files get created but all the files are empty. – Hasa Jul 04 '19 at 09:03
  • Why don't you post a zip file of pcaps somewhere so I can test my own code instead of needing to wait for you to report results back? – Charles Duffy Jul 04 '19 at 11:45
  • https://www.filemail.com/d/raisbwzezhpmcoj I have posted the zip file – Hasa Jul 04 '19 at 12:05
  • It would be great if I could get the output as in the image I sent earlier (separated by commas), because after separating the files into incoming and outgoing traffic, I next have to perform some calculations. That is easy if the output looks like the image I sent. Thanks – Hasa Jul 04 '19 at 12:53
  • Charles, I hope you were able to find the zip file. I noticed you have removed outCsv.close() and inCsv.close(). I removed those and ran the code. The code runs without errors, but I get the message tshark: output fields were specified with "-e", but "-Tek, -Tfields, -Tjson or -Tpdml" was not specified. And the csv files that get created are all empty. Can we use fields instead of tabs? What should I replace the \t with? Thanks – Hasa Jul 04 '19 at 21:21
  • sorry Charles. I didn't know that. It would be great if you can look into that when you are back to work. Thanks. Enjoy your holiday. – Hasa Jul 04 '19 at 21:55
  • I did take a look, and fixed some things. The output files are indeed empty, but that's because the pcap files don't really use addresses `00:00:00:00:00:00` and `11:11:11:11:11:11`; the command `tshark -r in.d/Mycaps/Folder1/ts1.pcap -Y 'wlan.sa==00:00:00:00:00:00 && wlan.da==11:11:11:11:11:11'` similarly has no output. BTW, note that this was much more involved service than it's usually appropriate to expect on Stack Overflow -- we generally answer one narrow, specific problem per question, rather than fixing code that has multiple things wrong with it. – Charles Duffy Jul 05 '19 at 01:07
  • I modified your code so that this time it accepts a csv file as input instead of .pcap files, and the data gets separated into two files based on column values (the same scenario as in this question; the only difference is the input file type). I tried this: https://stackoverflow.com/questions/57054039/csv-file-into-seperate-files-python-3-7, but it is not working. If possible, please let me know where I went wrong. Thanks a million. – Hasa Jul 16 '19 at 09:25
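
Regarding the empty output files discussed in the comments above: one way to find out which addresses a capture actually contains is to have tshark print `wlan.sa` and `wlan.da` for every frame and tally the pairs. The following is only a sketch under assumptions: the helper names are hypothetical, the sample lines are made up, and frames that do not yield exactly one source and one destination address are simply skipped.

```python
import collections
import subprocess

def address_pair_cmd(pcap_path):
    """Build a tshark argument list that prints 'source,destination' per frame."""
    return ['tshark', '-r', pcap_path, '-T', 'fields',
            '-e', 'wlan.sa', '-e', 'wlan.da', '-E', 'separator=,']

def count_address_pairs(lines):
    """Tally (source, destination) pairs from lines of 'sa,da' text."""
    counts = collections.Counter()
    for line in lines:
        fields = line.strip().split(',')
        # Skip frames where either address is missing or repeated.
        if len(fields) == 2 and all(fields):
            counts[tuple(fields)] += 1
    return counts

def list_address_pairs(pcap_path):
    """Run tshark on one capture and return its address-pair tally (needs tshark)."""
    proc = subprocess.run(address_pair_cmd(pcap_path),
                          stdout=subprocess.PIPE, check=True)
    return count_address_pairs(proc.stdout.decode('utf-8').splitlines())

# Made-up sample lines standing in for tshark output.
sample_lines = [
    'aa:aa:aa:aa:aa:aa,bb:bb:bb:bb:bb:bb',
    'aa:aa:aa:aa:aa:aa,bb:bb:bb:bb:bb:bb',
    ',',
    'bb:bb:bb:bb:bb:bb,aa:aa:aa:aa:aa:aa',
]
pair_counts = count_address_pairs(sample_lines)
print(pair_counts.most_common())
```

The most frequent pairs are reasonable candidates for the `wlan.sa` and `wlan.da` values in the `-Y` filters, in place of the placeholder addresses.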