Using Python, how can I return a JSON file after performing functions on it and converting it to a dictionary?

Question

I am currently working with netflow data in a json file. My job is to parse the json file and perform specific actions on the data within it. After doing so, I'm creating a new file and adding each new updated json object to it. What's happening is that they are all on the same line. Is there any way to get each json object/line onto a different line? The original file given to me was like this as well, and it just looks neater. Thanks!

EDIT:

What I want:

    {"@timestamp":"2015-05-18T19:26:13.000Z","netflow":{"version":"9","flow_seq_num":"188189","flowset_id":"257","last_switched":"2015-05-15T14:28:01.999Z","first_switched":"2015-05-15T14:27:37.999Z","in_bytes":"4800","in_pkts":"2","input_snmp":"5","output_snmp":"4","ipv4_src_addr":"10.10.1.4","ipv4_dst_addr":"192.1.44.182","protocol":"6","src_tos":"2","dst_tos":"0","l4_src_port":"443","l4_dst_port":"12080","flow_sampler_id":"0","ipv4_next_hop":"10.10.1.5","dst_mask":"37","src_mask":"21","tcp_flags":"27","direction":"1"},"@version":"1","host":"192.168.19.202","src_host_name":"","dst_host_name":"","app_name":"","tcp_flags_str":"","dscp":"","highval":"","src_blacklisted":"0","dst_blacklisted":"0","invalid_ToS":"0","bytes_per_packet":2400,"tcp_nominal_payload":"0","malformed_ip":"0","empty_tcp":"0","short_tcp_handshake":"0","icmp_malformed_packets":"0","snort_attack_flow":"0","empty_udp":"0","short_udp":"0","short_tcp_rstack":"0","short_tcp_pansf":"0","short_tcp_synack":"0","short_tcp_synrst":"0","short_tcp_finack":"0","short_tcp_pna":"0","non_unicast_src":"0","multicast":"0","broadcast":"0","network":"0","tcp_urg":"0","land_attack":"0","short_tcp_ack":"0","tcp_synfin":"0","tcp_fin":"0","malformed_tcp":"1","tcp_xmas":"0","udp_echo_req":"0","tcp_null":"0","tcp_syn":"0","malformed_udp":"0","tcp_rst":"0","icmp_request":"0","icmp_response":"0","icmp_port_unreachable":"0","icmp_host_unreachable":"0","icmp_unreachable_for_Tos":"0","icmp_network_unreachable":"0","icmp_redirects":"0","icmp_time_exceeded_flows":"0","icmp_parameter_problem_flows":"0","icmp_trace_route":"0","icmp_datagram":"0","udp_echo_chargen_broadcast":"0","udp_chargen_echo_broadcast":"0","icmp_src_quench":"0","icmp_proto_unreachable":"0","udp_echo_broadcast":"0","udp_echo_rsp":"0", "hi": 10}
    {"@timestamp":"2015-05-18T19:26:13.000Z","netflow":{"version":"9","flow_seq_num":"188189","flowset_id":"257","last_switched":"2015-05-15T14:28:01.999Z","first_switched":"2015-05-15T14:27:37.999Z","in_bytes":"77","in_pkts":"2","input_snmp":"7","output_snmp":"2","ipv4_src_addr":"192.1.44.179","ipv4_dst_addr":"10.10.1.8","protocol":"6","src_tos":"0","dst_tos":"2","l4_src_port":"12192","l4_dst_port":"443","flow_sampler_id":"0","ipv4_next_hop":"10.10.1.7","dst_mask":"12","src_mask":"37","tcp_flags":"24","direction":"0"},"@version":"1","host":"192.168.19.202","src_host_name":"","dst_host_name":"","app_name":"","tcp_flags_str":"","dscp":"","highval":"","src_blacklisted":"0","dst_blacklisted":"0","invalid_ToS":"0","bytes_per_packet":38,"tcp_nominal_payload":"0","malformed_ip":"0","empty_tcp":"0","short_tcp_handshake":"0","icmp_malformed_packets":"0","snort_attack_flow":"0","empty_udp":"0","short_udp":"0","short_tcp_rstack":"0","short_tcp_pansf":"0","short_tcp_synack":"0","short_tcp_synrst":"0","short_tcp_finack":"0","short_tcp_pna":"0","non_unicast_src":"0","multicast":"0","broadcast":"0","network":"0","tcp_urg":"0","land_attack":"0","short_tcp_ack":"0","tcp_synfin":"0","tcp_fin":"0","malformed_tcp":"1","tcp_xmas":"0","udp_echo_req":"0","tcp_null":"0","tcp_syn":"0","malformed_udp":"0","tcp_rst":"0","icmp_request":"0","icmp_response":"0","icmp_port_unreachable":"0","icmp_host_unreachable":"0","icmp_unreachable_for_Tos":"0","icmp_network_unreachable":"0","icmp_redirects":"0","icmp_time_exceeded_flows":"0","icmp_parameter_problem_flows":"0","icmp_trace_route":"0","icmp_datagram":"0","udp_echo_chargen_broadcast":"0","udp_chargen_echo_broadcast":"0","icmp_src_quench":"0","icmp_proto_unreachable":"0","udp_echo_broadcast":"0","udp_echo_rsp":"0", "yes":10}
    {"@timestamp":"2015-05-18T19:59:59.000Z","netflow":{"version":"9","flow_seq_num":"189654","flowset_id":"257","last_switched":"2015-05-15T14:25:09.999Z","first_switched":"2015-05-15T14:24:45.999Z","in_bytes":"8400","in_pkts":"1","input_snmp":"7","output_snmp":"2","ipv4_src_addr":"10.10.1.2","ipv4_dst_addr":"192.1.109.32","protocol":"6","src_tos":"2","dst_tos":"0","l4_src_port":"443","l4_dst_port":"12816","flow_sampler_id":"0","ipv4_next_hop":"10.10.1.3","dst_mask":"45","src_mask":"3","tcp_flags":"19","direction":"1"},"@version":"1","host":"192.168.19.202","src_host_name":"","dst_host_name":"","app_name":"","tcp_flags_str":"","dscp":"","highval":"","src_blacklisted":"0","dst_blacklisted":"0","invalid_ToS":"0","bytes_per_packet":8400,"tcp_nominal_payload":"0","malformed_ip":"0","empty_tcp":"0","short_tcp_handshake":"0","icmp_malformed_packets":"0","snort_attack_flow":"0","empty_udp":"0","short_udp":"0","short_tcp_rstack":"0","short_tcp_pansf":"0","short_tcp_synack":"0","short_tcp_synrst":"0","short_tcp_finack":"0","short_tcp_pna":"0","non_unicast_src":"0","multicast":"0","broadcast":"0","network":"0","tcp_urg":"0","land_attack":"0","short_tcp_ack":"0","tcp_synfin":"0","tcp_fin":"0","malformed_tcp":"1","tcp_xmas":"0","udp_echo_req":"0","tcp_null":"0","tcp_syn":"0","malformed_udp":"0","tcp_rst":"0","icmp_request":"0","icmp_response":"0","icmp_port_unreachable":"0","icmp_host_unreachable":"0","icmp_unreachable_for_Tos":"0","icmp_network_unreachable":"0","icmp_redirects":"0","icmp_time_exceeded_flows":"0","icmp_parameter_problem_flows":"0","icmp_trace_route":"0","icmp_datagram":"0","udp_echo_chargen_broadcast":"0","udp_chargen_echo_broadcast":"0","icmp_src_quench":"0","icmp_proto_unreachable":"0","udp_echo_broadcast":"0","udp_echo_rsp":"0", "no":10}
    {"@timestamp":"2015-05-18T19:33:58.000Z","netflow":{"version":"9","flow_seq_num":"188525","flowset_id":"257","last_switched":"2015-05-15T14:27:22.999Z","first_switched":"2015-05-15T14:26:58.999Z","in_bytes":"8300","in_pkts":"2","input_snmp":"3","output_snmp":"6","ipv4_src_addr":"10.10.1.6","ipv4_dst_addr":"192.1.59.124","protocol":"6","src_tos":"2","dst_tos":"0","l4_src_port":"80","l4_dst_port":"12660","flow_sampler_id":"0","ipv4_next_hop":"10.10.1.4","dst_mask":"28","src_mask":"13","tcp_flags":"19","direction":"1"},"@version":"1","host":"192.168.19.202","src_host_name":"","dst_host_name":"","app_name":"","tcp_flags_str":"","dscp":"","highval":"","src_blacklisted":"0","dst_blacklisted":"0","invalid_ToS":"0","bytes_per_packet":4150,"tcp_nominal_payload":"0","malformed_ip":"0","empty_tcp":"0","short_tcp_handshake":"0","icmp_malformed_packets":"0","snort_attack_flow":"0","empty_udp":"0","short_udp":"0","short_tcp_rstack":"0","short_tcp_pansf":"0","short_tcp_synack":"0","short_tcp_synrst":"0","short_tcp_finack":"0","short_tcp_pna":"0","non_unicast_src":"0","multicast":"0","broadcast":"0","network":"0","tcp_urg":"0","land_attack":"0","short_tcp_ack":"0","tcp_synfin":"0","tcp_fin":"0","malformed_tcp":"1","tcp_xmas":"0","udp_echo_req":"0","tcp_null":"0","tcp_syn":"0","malformed_udp":"0","tcp_rst":"0","icmp_request":"0","icmp_response":"0","icmp_port_unreachable":"0","icmp_host_unreachable":"0","icmp_unreachable_for_Tos":"0","icmp_network_unreachable":"0","icmp_redirects":"0","icmp_time_exceeded_flows":"0","icmp_parameter_problem_flows":"0","icmp_trace_route":"0","icmp_datagram":"0","udp_echo_chargen_broadcast":"0","udp_chargen_echo_broadcast":"0","icmp_src_quench":"0","icmp_proto_unreachable":"0","udp_echo_broadcast":"0","udp_echo_rsp":"0", "bye": 10}

What is happening currently: Instead of being on separate lines, each JSON object is just right after another (so it's one big line when you open it up to read it). I want it to be separated line by line, like above. I would have shown the current output but I was going way over my character limit.

Hope this helps!

@ColonelThirtyTwo Not exactly! Please refer to my edits above :) — Ria, Jun 03 '15 at 16:41
So you have a bunch of JSON objects concatenated with each other (`{"foo": 1}{"foo": 2}...`), and you want to separate them? — Colonel Thirty Two, Jun 03 '15 at 16:44
That's tricky. Most JSON parsers can't deal with that type of data. You could count the number of braces open, and once all of them have been closed, separate the object, but you have to ignore braces inside of strings. — Colonel Thirty Two, Jun 03 '15 at 16:51

Martin Hallén · Answer 1 · 2015-06-03T18:15:05.897

It's difficult to know exactly know what your problem is without your code, but I guess you want to use json.dumps.

Example usage from docs:

>>> import json
>>> print json.dumps({'4': 5, '6': 7},
...                  indent=4, separators=(',', ': '))
{
    "4": 5,
    "6": 7
}

In your example, this would look something like this:

data = [{"@timestamp":"2015-05-18T19:26:13.000Z","netflow":{"version":"9"}},{"@timestamp":"2015-05-18T19:26:13.000Z","netflow":{"version":"9"}}]

with open("test.json", "w"):
    json.dumps(data, indent=1)

# test.json
[
 {
  "@timestamp": "2015-05-18T19:26:13.000Z",
  "netflow": {
   "version": "9"
  }
 },
 {
  "@timestamp": "2015-05-18T19:26:13.000Z",
  "netflow": {
   "version": "9"
  }
 }
]

EDIT: If you want each json-object on their own line, you could use:

with open("test.json", "w+") as f:
    for line in data:
        f.write(str(line) + "\n")

#test.json
{'@timestamp': '2015-05-18T19:26:13.000Z', 'netflow': {'version': '9'}}
{'@timestamp': '2015-05-18T19:26:13.000Z', 'netflow': {'version': '9'}}

Note that this is not valid json, as they have to be in an array. You might also want to replace '-quotes with "-quotes. That can be done with .replace("'", '"')

I actually don't want my json objects to be nested! I just want each separate json object to be on different lines. — Ria, Jun 03 '15 at 18:00
The "\n" worked! I can't believe I didn't think of that. Thanks! Also, wait, where did I use single quotes instead of double quotes? @mart0903 — Ria, Jun 03 '15 at 19:25
Python prints strings with single qoutes by default. But you might have a different case/setup than me, so you should be fine :) Please accept if this answer is pleasing ;) — Martin Hallén, Jun 03 '15 at 22:45

score 0 · Answer 2 · answered Jun 03 '15 at 17:58

0

You can use pretty-print library.

import pprint

beautify = {"@timestamp":"2015-05-18T19:26:13.000Z......."} #Your Input
pp = pprint.PrettyPrinter(indent=4)
pp.pprint(beautify)

answered Jun 03 '15 at 17:58

user1767754

23,311
18
141
164

Using Python, how can I return a JSON file after performing functions on it and converting it to a dictionary?

2 Answers2