I'm using pyshark and am trying to print out packets as JSON. This is my code:

import pyshark
import json

capture = pyshark.LiveCapture(interface='eth0', bpf_filter='http', use_json=True)

for packet in capture.sniff_continuously(packet_count=10):
    print(json.loads(str(packet)))

But I'm getting the error:

    raise JSONDecodeError("Expecting value", s, err.value) from None
json.decoder.JSONDecodeError: Expecting value: line 1 column 1 (char 0)

Also, when simply running print(packet) it isn't JSON.

UPDATE

I've tried with this:

cmd = 'tshark -i en0 -f http -T json -x -l --no-duplicate-keys'
args = shlex.split(cmd)
tshark = subprocess.Popen(args, stdout=subprocess.PIPE)
for line in io.TextIOWrapper(tshark.stdout, encoding="utf-8"):
    print("test: %s" % line.rstrip())

But that prints every single line of the JSON output separately instead of one combined object per packet; I guess that's because I'm reading the pipe line by line. Can this be changed so that I get the actual JSON object per packet?

Alfred Balle
  • As your error shows, the problem is that `packet` is `None`, not even a `string`. So you should first check why `capture.sniff_continuously(packet_count=10)` is returning or yielding `None`. – shayan rok rok Mar 12 '20 at 19:40
  • Adding before `print` the following `if packet is not None:` throws the same error. – Alfred Balle Mar 12 '20 at 19:53
  • Anyway, I think my problem in general is understanding how to correctly parse JSON from `pyshark` when using its `use_json=True`. – Alfred Balle Mar 12 '20 at 19:55

1 Answer

You can't decode it as JSON because the packet's string representation isn't JSON; it is a human-readable summary of the packet's layers:

# print_json.py
import pyshark
import json

capture = pyshark.LiveCapture(interface='en0', use_json=True)
for packet in capture.sniff_continuously(packet_count=1):
    print(packet)

Output:

$ python print_json.py
Packet (Length: 78)
Layer ETH:
    dst:
        ig: 0
        eth.dst_resolved: cc:65:ad:da:39:70
        dst_resolved: cc:65:ad:da:39:70
        lg: 0
...
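
To see why `json.loads` reports exactly that error, here is a minimal sketch that feeds it text shaped like the output above. The `text` value is a hand-written stand-in for `str(packet)`, not real pyshark output:

```python
import json

# Assumption: a hand-written stand-in for what str(packet) looks like.
text = "Packet (Length: 78)\nLayer ETH:\n"

try:
    json.loads(text)
except json.JSONDecodeError as err:
    # The very first character ("P") cannot start a JSON value.
    message = str(err)
    print(message)
```

The parser fails at character 0, which matches the traceback in the question.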

Per the relevant PyShark file, the param use_json is experimental:

:param use_json: Uses tshark in JSON mode (EXPERIMENTAL). It is a good deal faster than XML

Calling tshark with the -T json flag gives you JSON directly, and faster (it is what use_json is based on).

Using tshark directly instead

You should consider using scapy or tshark directly. In this example, we print one packet's ethernet layer by calling tshark with subprocess:

# print_eth_json.py
import json
import subprocess as sp
import pprint

json_str = sp.check_output("tshark -c 2 -T json".split(' ')).decode('utf-8')
tshark_pkts = json.loads(json_str)
# Transform tshark json into a scapy-like packet-json list.
pkts_json = [pkt['_source']['layers'] for pkt in tshark_pkts]
pprint.pprint(pkts_json[0]['eth'])

And then running it:

$ python print_eth_json.py
Capturing on 'Wi-Fi: en0'
2
55 packets dropped from Wi-Fi: en0
{'eth.dst': 'cc:65:ad:da:39:70',
 'eth.dst_tree': {'eth.addr': 'cc:65:ad:da:39:70',
                  'eth.addr_resolved': 'ArrisGro_da:39:70',
                  'eth.dst_resolved': 'ArrisGro_da:39:70',
                  'eth.ig': '0',
...
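
If you need one JSON object per packet while tshark is still running (as in the question's update), one possible approach is to buffer the lines of the `-T json` output and decode each top-level object as soon as it is complete. This is only a sketch: `iter_packets` and `stream_tshark` are hypothetical helper names, and it assumes tshark emits a JSON array whose elements are objects.

```python
import json
import subprocess


def iter_packets(lines):
    """Yield one dict per top-level object of a streamed JSON array."""
    decoder = json.JSONDecoder()
    buf = ""
    for line in lines:
        buf += line
        # Only try to decode when this line could close an object.
        if not line.rstrip().endswith(("}", "},")):
            continue
        # Drop the array's opening bracket, separators and whitespace.
        candidate = buf.lstrip(" \t\r\n[,")
        try:
            obj, end = decoder.raw_decode(candidate)
        except json.JSONDecodeError:
            continue  # the object is not complete yet; keep reading
        yield obj
        buf = candidate[end:]


def stream_tshark(interface="en0"):
    """Hypothetical helper: run tshark and yield packets as dicts."""
    # -l flushes stdout after each packet so objects arrive promptly.
    cmd = ["tshark", "-i", interface, "-T", "json", "-l"]
    with subprocess.Popen(cmd, stdout=subprocess.PIPE, text=True) as proc:
        yield from iter_packets(proc.stdout)
```

With this, `for pkt in stream_tshark("en0"): print(pkt["_source"]["layers"])` would handle packets one at a time instead of waiting for the capture to finish.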
Ross Jacobs
  • Hi Ross, thank you for the detailed answer. I'm not sure I fully understand what you mean by `Using tshark with the -T json flag will give you a json faster (and is what the use_json is based on).` Is `-T json` for handling JSON, or for encoding packets into JSON format? I need a solution that works under high load. I'm not sure how your `subprocess` approach handles multiple packets under high load, or how to apply it. Can you give a quick example, or is `scapy` better for this? I'm not going to manipulate the packet; what matters is that I keep its original structure. – Alfred Balle Mar 13 '20 at 06:08
  • Maybe combine with a JSON stream reader, but not sure it is the right way and overhead maybe too big? – Alfred Balle Mar 13 '20 at 13:56
  • Overhead of what is too big? If we’re talking about subprocess, it’s part of the Python Standard Library and is a wrapper for C’s Popen. – Ross Jacobs Mar 13 '20 at 15:42
  • Agree. But I was thinking of a JSON stream reader, which seems to be required for getting the single JSON object per packet. – Alfred Balle Mar 13 '20 at 16:46
  • If you think I answered the question you asked, please mark it as such. It sounds like you want to ask a separate question about streaming packets in JSON form. It also sounds like you have requirements, which you shouldn’t detail in that question. – Ross Jacobs Mar 13 '20 at 19:57