-1

I am currently working with netflow data in a json file. My job is to parse the json file and perform specific actions on the data within it. After doing so, I'm creating a new file and adding each new updated json object to it. What's happening is that they are all on the same line. Is there any way to get each json object/line onto a different line? The original file given to me was like this as well, and it just looks neater. Thanks!

EDIT:

What I want:

    {"@timestamp":"2015-05-18T19:26:13.000Z","netflow":{"version":"9","flow_seq_num":"188189","flowset_id":"257","last_switched":"2015-05-15T14:28:01.999Z","first_switched":"2015-05-15T14:27:37.999Z","in_bytes":"4800","in_pkts":"2","input_snmp":"5","output_snmp":"4","ipv4_src_addr":"10.10.1.4","ipv4_dst_addr":"192.1.44.182","protocol":"6","src_tos":"2","dst_tos":"0","l4_src_port":"443","l4_dst_port":"12080","flow_sampler_id":"0","ipv4_next_hop":"10.10.1.5","dst_mask":"37","src_mask":"21","tcp_flags":"27","direction":"1"},"@version":"1","host":"192.168.19.202","src_host_name":"","dst_host_name":"","app_name":"","tcp_flags_str":"","dscp":"","highval":"","src_blacklisted":"0","dst_blacklisted":"0","invalid_ToS":"0","bytes_per_packet":2400,"tcp_nominal_payload":"0","malformed_ip":"0","empty_tcp":"0","short_tcp_handshake":"0","icmp_malformed_packets":"0","snort_attack_flow":"0","empty_udp":"0","short_udp":"0","short_tcp_rstack":"0","short_tcp_pansf":"0","short_tcp_synack":"0","short_tcp_synrst":"0","short_tcp_finack":"0","short_tcp_pna":"0","non_unicast_src":"0","multicast":"0","broadcast":"0","network":"0","tcp_urg":"0","land_attack":"0","short_tcp_ack":"0","tcp_synfin":"0","tcp_fin":"0","malformed_tcp":"1","tcp_xmas":"0","udp_echo_req":"0","tcp_null":"0","tcp_syn":"0","malformed_udp":"0","tcp_rst":"0","icmp_request":"0","icmp_response":"0","icmp_port_unreachable":"0","icmp_host_unreachable":"0","icmp_unreachable_for_Tos":"0","icmp_network_unreachable":"0","icmp_redirects":"0","icmp_time_exceeded_flows":"0","icmp_parameter_problem_flows":"0","icmp_trace_route":"0","icmp_datagram":"0","udp_echo_chargen_broadcast":"0","udp_chargen_echo_broadcast":"0","icmp_src_quench":"0","icmp_proto_unreachable":"0","udp_echo_broadcast":"0","udp_echo_rsp":"0", "hi": 10}
    {"@timestamp":"2015-05-18T19:26:13.000Z","netflow":{"version":"9","flow_seq_num":"188189","flowset_id":"257","last_switched":"2015-05-15T14:28:01.999Z","first_switched":"2015-05-15T14:27:37.999Z","in_bytes":"77","in_pkts":"2","input_snmp":"7","output_snmp":"2","ipv4_src_addr":"192.1.44.179","ipv4_dst_addr":"10.10.1.8","protocol":"6","src_tos":"0","dst_tos":"2","l4_src_port":"12192","l4_dst_port":"443","flow_sampler_id":"0","ipv4_next_hop":"10.10.1.7","dst_mask":"12","src_mask":"37","tcp_flags":"24","direction":"0"},"@version":"1","host":"192.168.19.202","src_host_name":"","dst_host_name":"","app_name":"","tcp_flags_str":"","dscp":"","highval":"","src_blacklisted":"0","dst_blacklisted":"0","invalid_ToS":"0","bytes_per_packet":38,"tcp_nominal_payload":"0","malformed_ip":"0","empty_tcp":"0","short_tcp_handshake":"0","icmp_malformed_packets":"0","snort_attack_flow":"0","empty_udp":"0","short_udp":"0","short_tcp_rstack":"0","short_tcp_pansf":"0","short_tcp_synack":"0","short_tcp_synrst":"0","short_tcp_finack":"0","short_tcp_pna":"0","non_unicast_src":"0","multicast":"0","broadcast":"0","network":"0","tcp_urg":"0","land_attack":"0","short_tcp_ack":"0","tcp_synfin":"0","tcp_fin":"0","malformed_tcp":"1","tcp_xmas":"0","udp_echo_req":"0","tcp_null":"0","tcp_syn":"0","malformed_udp":"0","tcp_rst":"0","icmp_request":"0","icmp_response":"0","icmp_port_unreachable":"0","icmp_host_unreachable":"0","icmp_unreachable_for_Tos":"0","icmp_network_unreachable":"0","icmp_redirects":"0","icmp_time_exceeded_flows":"0","icmp_parameter_problem_flows":"0","icmp_trace_route":"0","icmp_datagram":"0","udp_echo_chargen_broadcast":"0","udp_chargen_echo_broadcast":"0","icmp_src_quench":"0","icmp_proto_unreachable":"0","udp_echo_broadcast":"0","udp_echo_rsp":"0", "yes":10}
    {"@timestamp":"2015-05-18T19:59:59.000Z","netflow":{"version":"9","flow_seq_num":"189654","flowset_id":"257","last_switched":"2015-05-15T14:25:09.999Z","first_switched":"2015-05-15T14:24:45.999Z","in_bytes":"8400","in_pkts":"1","input_snmp":"7","output_snmp":"2","ipv4_src_addr":"10.10.1.2","ipv4_dst_addr":"192.1.109.32","protocol":"6","src_tos":"2","dst_tos":"0","l4_src_port":"443","l4_dst_port":"12816","flow_sampler_id":"0","ipv4_next_hop":"10.10.1.3","dst_mask":"45","src_mask":"3","tcp_flags":"19","direction":"1"},"@version":"1","host":"192.168.19.202","src_host_name":"","dst_host_name":"","app_name":"","tcp_flags_str":"","dscp":"","highval":"","src_blacklisted":"0","dst_blacklisted":"0","invalid_ToS":"0","bytes_per_packet":8400,"tcp_nominal_payload":"0","malformed_ip":"0","empty_tcp":"0","short_tcp_handshake":"0","icmp_malformed_packets":"0","snort_attack_flow":"0","empty_udp":"0","short_udp":"0","short_tcp_rstack":"0","short_tcp_pansf":"0","short_tcp_synack":"0","short_tcp_synrst":"0","short_tcp_finack":"0","short_tcp_pna":"0","non_unicast_src":"0","multicast":"0","broadcast":"0","network":"0","tcp_urg":"0","land_attack":"0","short_tcp_ack":"0","tcp_synfin":"0","tcp_fin":"0","malformed_tcp":"1","tcp_xmas":"0","udp_echo_req":"0","tcp_null":"0","tcp_syn":"0","malformed_udp":"0","tcp_rst":"0","icmp_request":"0","icmp_response":"0","icmp_port_unreachable":"0","icmp_host_unreachable":"0","icmp_unreachable_for_Tos":"0","icmp_network_unreachable":"0","icmp_redirects":"0","icmp_time_exceeded_flows":"0","icmp_parameter_problem_flows":"0","icmp_trace_route":"0","icmp_datagram":"0","udp_echo_chargen_broadcast":"0","udp_chargen_echo_broadcast":"0","icmp_src_quench":"0","icmp_proto_unreachable":"0","udp_echo_broadcast":"0","udp_echo_rsp":"0", "no":10}
    {"@timestamp":"2015-05-18T19:33:58.000Z","netflow":{"version":"9","flow_seq_num":"188525","flowset_id":"257","last_switched":"2015-05-15T14:27:22.999Z","first_switched":"2015-05-15T14:26:58.999Z","in_bytes":"8300","in_pkts":"2","input_snmp":"3","output_snmp":"6","ipv4_src_addr":"10.10.1.6","ipv4_dst_addr":"192.1.59.124","protocol":"6","src_tos":"2","dst_tos":"0","l4_src_port":"80","l4_dst_port":"12660","flow_sampler_id":"0","ipv4_next_hop":"10.10.1.4","dst_mask":"28","src_mask":"13","tcp_flags":"19","direction":"1"},"@version":"1","host":"192.168.19.202","src_host_name":"","dst_host_name":"","app_name":"","tcp_flags_str":"","dscp":"","highval":"","src_blacklisted":"0","dst_blacklisted":"0","invalid_ToS":"0","bytes_per_packet":4150,"tcp_nominal_payload":"0","malformed_ip":"0","empty_tcp":"0","short_tcp_handshake":"0","icmp_malformed_packets":"0","snort_attack_flow":"0","empty_udp":"0","short_udp":"0","short_tcp_rstack":"0","short_tcp_pansf":"0","short_tcp_synack":"0","short_tcp_synrst":"0","short_tcp_finack":"0","short_tcp_pna":"0","non_unicast_src":"0","multicast":"0","broadcast":"0","network":"0","tcp_urg":"0","land_attack":"0","short_tcp_ack":"0","tcp_synfin":"0","tcp_fin":"0","malformed_tcp":"1","tcp_xmas":"0","udp_echo_req":"0","tcp_null":"0","tcp_syn":"0","malformed_udp":"0","tcp_rst":"0","icmp_request":"0","icmp_response":"0","icmp_port_unreachable":"0","icmp_host_unreachable":"0","icmp_unreachable_for_Tos":"0","icmp_network_unreachable":"0","icmp_redirects":"0","icmp_time_exceeded_flows":"0","icmp_parameter_problem_flows":"0","icmp_trace_route":"0","icmp_datagram":"0","udp_echo_chargen_broadcast":"0","udp_chargen_echo_broadcast":"0","icmp_src_quench":"0","icmp_proto_unreachable":"0","udp_echo_broadcast":"0","udp_echo_rsp":"0", "bye": 10}

What is happening currently: Instead of being on separate lines, each JSON object is just right after another (so it's one big line when you open it up to read it). I want it to be separated line by line, like above. I would have shown the current output but I was going way over my character limit.

Hope this helps!

Ria

2 Answers

0

It's difficult to know exactly what your problem is without seeing your code, but I guess you want to use json.dumps.

Example usage from docs:

>>> import json
>>> print(json.dumps({'4': 5, '6': 7},
...                  indent=4, separators=(',', ': ')))
{
    "4": 5,
    "6": 7
}

In your example, this would look something like this:

import json

data = [{"@timestamp":"2015-05-18T19:26:13.000Z","netflow":{"version":"9"}},{"@timestamp":"2015-05-18T19:26:13.000Z","netflow":{"version":"9"}}]

# Bind the file handle and let json.dump write the indented output to it
with open("test.json", "w") as f:
    json.dump(data, f, indent=1)

# test.json
[
 {
  "@timestamp": "2015-05-18T19:26:13.000Z",
  "netflow": {
   "version": "9"
  }
 },
 {
  "@timestamp": "2015-05-18T19:26:13.000Z",
  "netflow": {
   "version": "9"
  }
 }
]

EDIT: If you want each JSON object on its own line, you could use:

with open("test.json", "w+") as f:
    for line in data:
        f.write(str(line) + "\n")

#test.json
{'@timestamp': '2015-05-18T19:26:13.000Z', 'netflow': {'version': '9'}}
{'@timestamp': '2015-05-18T19:26:13.000Z', 'netflow': {'version': '9'}}

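Note that this file is not valid JSON. str() writes Python's dict representation, which uses '-quotes instead of "-quotes, and a file holding several top-level objects is not a single JSON document unless they are wrapped in an array. You can patch the quotes with .replace("'", '"'), but that breaks as soon as a value itself contains a quote character; writing json.dumps(line) instead of str(line) produces proper "-quoted JSON on every line.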
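
As a rough sketch of the one-object-per-line layout asked for in the question, each record can be serialised with json.dumps rather than str, so that every line parses as JSON on its own. The sample records, the output path, and the helper names write_json_lines and read_json_lines are made up for illustration; they are not from the original post.

```python
import json

# Hypothetical sample records standing in for the updated netflow objects.
records = [
    {"@timestamp": "2015-05-18T19:26:13.000Z", "netflow": {"version": "9"}},
    {"@timestamp": "2015-05-18T19:59:59.000Z", "netflow": {"version": "9"}},
]


def write_json_lines(objects, path):
    """Write each object as compact JSON followed by a newline."""
    with open(path, "w") as f:
        for obj in objects:
            # separators=(",", ":") drops the spaces json.dumps adds by default
            f.write(json.dumps(obj, separators=(",", ":")) + "\n")


def read_json_lines(path):
    """Parse a file that holds one JSON object per non-empty line."""
    with open(path) as f:
        return [json.loads(line) for line in f if line.strip()]


if __name__ == "__main__":
    write_json_lines(records, "test.json")
    print(read_json_lines("test.json") == records)
```

Each line can then be read back independently with json.loads, which is not possible with the str() output shown above.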

Martin Hallén
  • I actually don't want my json objects to be nested! I just want each separate json object to be on different lines. – Ria Jun 03 '15 at 18:00
  • The "\n" worked! I can't believe I didn't think of that. Thanks! Also, wait, where did I use single quotes instead of double quotes? @mart0903 – Ria Jun 03 '15 at 19:25
  • Python prints strings with single quotes by default. But you might have a different case/setup than me, so you should be fine :) Please accept if this answer is pleasing ;) – Martin Hallén Jun 03 '15 at 22:45
0

You can use the pprint (pretty-print) module from the standard library.

import pprint

beautify = {"@timestamp":"2015-05-18T19:26:13.000Z......."} #Your Input
pp = pprint.PrettyPrinter(indent=4)
pp.pprint(beautify)
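
One caveat worth checking before writing pprint output to a file: pprint formats the Python representation of the object, not JSON. The small sketch below compares the two on a made-up record (the record contents are an assumption for illustration only).

```python
import json
import pprint

# Hypothetical record; any dict of strings shows the same difference.
record = {"netflow": {"version": "9"}, "host": "192.168.19.202"}

as_repr = pprint.pformat(record)  # Python repr: strings are single-quoted
as_json = json.dumps(record)      # JSON text: strings are double-quoted

print(as_repr)
print(as_json)

# The JSON text round-trips through json.loads; the pprint text does not.
try:
    json.loads(as_repr)
    repr_is_json = True
except ValueError:
    repr_is_json = False
print(repr_is_json)
```

So pprint is handy for inspecting data on screen, while json.dumps is the safer choice when the output must be read back as JSON.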
user1767754