I have some Apache access logs I want to parse using IPWhois.
I want to group the IPWhois results based on the asn_description field.
Isn't it strange that the set and itertools.groupby() in the following snippet yield different outcomes?
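from itertools import groupby

descs = set()
with open(RESULTSFILE, 'a+') as r:
    # Print each group key as groupby yields it, and collect the keys in a set
    for description, items in groupby(results, key=lambda x: x['asn_description']):
        print('ASN Description: ' + description)
        descs.add(description)
print(descs)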
Example output:
ASN Description: GOOGLE - Google LLC, US
ASN Description: AVAST-AS-DC, CZ
ASN Description: FACEBOOK - Facebook, Inc., US
ASN Description: AVAST-AS-DC, CZ
ASN Description: AMAZON-AES - Amazon.com, Inc., US
ASN Description: FACEBOOK - Facebook, Inc., US
ASN Description: AMAZON-02 - Amazon.com, Inc., US
ASN Description: AMAZON-02 - Amazon.com, Inc., US
ASN Description: GOOGLE - Google LLC, US
ASN Description: GOOGLE-2 - Google LLC, US
ASN Description: AMAZON-02 - Amazon.com, Inc., US
{'FACEBOOK - Facebook, Inc., US', 'AVAST-AS-DC, CZ', 'AMAZON-AES - Amazon.com, Inc., US', 'GOOGLE-2 - Google LLC, US', 'GOOGLE - Google LLC, US', 'AMAZON-02 - Amazon.com, Inc., US',
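A minimal sketch that may help narrow this down. It assumes the behaviour comes from itertools.groupby() only grouping consecutive items that share a key, so unsorted input can produce the same key more than once, while a set keeps each value once. The sample records and the key helper below are hypothetical stand-ins, not real IPWhois output.

```python
from itertools import groupby

# Hypothetical sample records standing in for IPWhois lookup results.
results = [
    {"asn_description": "GOOGLE - Google LLC, US"},
    {"asn_description": "AVAST-AS-DC, CZ"},
    {"asn_description": "GOOGLE - Google LLC, US"},
]


def key(record):
    """Return the field the records are grouped on."""
    return record["asn_description"]


# groupby starts a new group every time the key changes, so the
# non-adjacent GOOGLE records end up in two separate groups.
unsorted_keys = [k for k, _ in groupby(results, key=key)]

# Sorting by the same key first makes equal keys adjacent,
# so each key is yielded exactly once.
sorted_keys = [k for k, _ in groupby(sorted(results, key=key), key=key)]

# A set keeps each distinct value once, regardless of input order.
unique = {key(r) for r in results}

print(unsorted_keys)
print(sorted_keys)
print(unique)
```

If that assumption holds, sorting results by asn_description before calling groupby() should make the printed groups match the contents of the set.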