I am using the rich library to display JSON data retrieved with aiohttp. It works great when printing the data directly from the API: the output is nicely formatted, with line breaks, so it is easy to read:
{
'city': 'Haidian',
'region_code': 'BJ',
'os': None,
'tags': [],
'ip': 1699530633,
'isp': 'China Education and Research Network Center',
'area_code': None,
'longitude': 116.28868,
'last_update': '2021-12-16T05:42:00.377583',
'ports': [8888],
'latitude': 39.99064,
'hostnames': [],
'postal_code': None,
'country_code': 'CN',
'country_name': 'China',
'domains': [],
'org': 'China Education and Research Network',
'data': [
{
'_shodan': {'options': {}, 'id': '1d25e274-18ce-4a3d-8e1c-73e5bf35bf76', 'module': 'http-simple-new', 'crawler': '42f86247b760542c0192b61c60405edc5db01d55'},
'hash': -1008250258,
'os': None,
'opts': {},
'timestamp': '2021-12-16T05:42:00.377583',
'isp': 'China Education and Research Network Center',
'port': 8888,
'hostnames': [],
'location': {'city': 'Haidian', 'region_code': 'BJ', 'area_code': None, 'longitude': 116.28868, 'country_name': 'China', 'postal_code': None, 'country_code': 'CN', 'latitude': 39.99064},
'ip': 1699530633,
'domains': [],
'org': 'China Education and Research Network',
'data': 'GET / HTTP/1.1\r\nHost: 101.76.199.137\r\n\r\n',
'asn': 'AS4538',
'transport': 'tcp',
'ip_str': '101.x.199.x'
}
],
'asn': 'AS4538',
'ip_str': '101.x.199.x'
}
The program then stores that result in a dictionary, keyed by IP address, like:
ipInfo = {}

async def host(ip):
    ret = await fetch(ip)
    ipInfo[ip] = ret
Then, after it has finished with a list of IP addresses, it writes this dictionary to a file. The issue I am having is that when I load this data to review it later and attempt to parse it, the rich library does not format it nicely the way it does when the data comes straight from the API. It always ends up looking like:
[{'hash': -644847518, 'timestamp': '2021-12-27T15:08:16.109960', 'isp': 'VNPT Corp', 'transport': 'tcp', 'data': 'GET / HTTP/1.1\r\nHost: 113.x.185.x\r\n\r\n', 'asn': 'AS45899', 'port': 5555, 'hostnames': ['static.vnpt.vn'],
'location': {'city': 'Vị Thanh', 'region_code': '73', 'area_code': None, 'longitude': 105.47012, 'latitude': 9.78449, 'postal_code': None, 'country_code': 'VN', 'country_name': 'Viet Nam'}, 'ip': 1906751888, 'domains': ['vnpt.vn'],
'org': 'Vietnam Posts and Telecommunications Group', 'os': None, '_shodan': {'crawler': 'd905ab419aeb10e9c57a336c7e1aa9629ae4a733', 'options': {}, 'id': '33f5bd73-c7d7-4dc0-beb8-b17afb53d931', 'module': 'http-simple-new', 'ptr':
True}, 'opts': {}, 'ip_str': '113.x.185.x'}], 'asn': 'AS45899', 'city': 'Vị Thanh', 'latitude': 9.78449, 'isp': 'VNPT Corp', 'longitude': 105.47012, 'last_update': '2021-12-27T15:08:16.109960', 'country_name': 'Viet Nam',
'ip_str': '113.x.185.x', 'os': None, 'ports': [5555]}
And that does not work for me because I need to be able to actually read it. The code I am currently using to parse it looks like:
if argsc.parse:
    _print(f'Opening {argsc.parse}')
    with open(argsc.parse, 'r') as f:
        f = f.read()
    rich.print(f)
    exit(0)
I have tried using rich.print_json and parsing the dictionary entries one at a time, all sorts of things really. I did notice while writing this post that if the data is saved like it is in the first example, with the nice newline formatting, then it does parse correctly, but I don't know how to do that either.
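One possible explanation, offered as a guess rather than a confirmed diagnosis: the parse code reads the file with f.read(), which returns a plain string, and rich lays out Python containers (dicts, lists) across lines but prints a string more or less as-is. The sketch below assumes the file holds either JSON or the Python repr of the dictionary (single quotes and None, as the sample output suggests). The names load_saved and show are hypothetical helpers, not part of the original program.

```python
import ast
import json


def load_saved(text):
    """Turn the saved text back into Python objects.

    Tries JSON first, then falls back to a Python literal
    (single quotes, None), which is an assumption about how
    the file was written.
    """
    try:
        return json.loads(text)
    except json.JSONDecodeError:
        return ast.literal_eval(text)


def show(path):
    # Imported here so load_saved() stays usable without rich installed.
    from rich.pretty import pprint

    with open(path, "r", encoding="utf-8") as fh:
        data = load_saved(fh.read())
    # Passing a dict/list (not a str) lets rich expand it across lines.
    pprint(data, expand_all=True)
```

With this, the parse branch could call show(argsc.parse) in place of reading the file and printing the raw string.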
So my question is really two questions: 1) How do I save the data from rich so that it is saved the way I see it on the screen? 2) How do I parse JSON data in a file that has the nice newline formatting seen in the first example? Is that even possible? Maybe that is the way it comes back from the API and it is being written differently. But I tried writing the data as-is, without adding it to a dictionary, and that did not work either.
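For the first question, a minimal sketch of one approach, assuming the results are plain dicts, lists, strings, numbers and None (which is what the sample data shows): write the dictionary with the standard json module and an indent, so the file itself has line breaks and is valid JSON. The function names save_results and load_results are hypothetical.

```python
import json


def save_results(results, path):
    """Write the collected results as indented, human-readable JSON."""
    with open(path, "w", encoding="utf-8") as fh:
        # indent=4 puts each key on its own line;
        # ensure_ascii=False keeps names like 'Vị Thanh' readable.
        json.dump(results, fh, indent=4, ensure_ascii=False)


def load_results(path):
    """Read the JSON file back into a dictionary."""
    with open(path, "r", encoding="utf-8") as fh:
        return json.load(fh)
```

A file written this way can be opened in any text editor, and the dictionary returned by load_results can be handed to rich.print, which should then format it the same way as the data coming straight from the API.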