I'm trying to parse fields from the JSON output of an API query.

Example json:

{
   "_id":"611a7b651571d300074b9875",
   "Details":{
      "Source":{
         "Type":"WHOIS servers",
         "URL":"http://tiktea.com.au",
         "NetworkType":"ClearWeb"
      },
      "Type":"Phishing",
      "SubType":"RegisteredSuspiciousDomain",
      "Severity":"Medium"
   },
   "Assignees":[
      
   ],
   "FoundDate":"2021-08-16T14:51:17.747Z",
   "Assets":[
      {
         "Type":"Domains",
         "Value":"iktea.net"
      }
   ],
   "TakedownStatus":"NotSent",
   "IsFlagged":false,
   "Closed":{
      "IsClosed":false
   }
}

I want to parse some fields, and get something like:

URL: http://tiktea.com.au
FoundDate: 2021-08-16T14:51:17.747Z

So I tried the approach from "Selecting fields from JSON output", but it didn't work.

This is my python code:

import requests
import json


#UserID + Api Key
auth = ('USERID','APIKEY')
headers = {    'Content-Type': "application/json",    }


url = "https://api.intsights.com/public/v1/data/alerts/alerts-list?alertType=Phishing"
alerts = requests.request("GET", url, auth = auth, headers=headers)

for alertID in alerts.text.split('","'):
  url = "https://api.intsights.com/public/v1/data/alerts/get-alert/"+alertID #<-----This is the request that extract the JSON
  alertDetails = requests.request("GET", url, auth = auth, headers=headers)
  

  dict = alertDetails.text
  url=dict['Assets'][0]['Value'] #<---------Trying to parse the desired value
  print(url)

This is the result:

Traceback (most recent call last):
  File "cefexample.py", line 73, in <module>
    url=dict['Assets'][0]['Value']
TypeError: string indices must be integers

Any suggestion?

Balastrong
Richard
  • could you update your question to show what the `alerts` response request looks like? – David Culbreth Sep 12 '21 at 19:31
  • @DavidCulbreth the response of 'alerts' is: ["6132fab73125ec00078e7180","6133790774446400078f665a","6133791144b6650007c6a3f4","61337911c406420008499cb9","61337c8792587f000b68c127"] – Richard Sep 12 '21 at 19:41
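A side note on the response quoted in the comment above: the body is itself a JSON array, so splitting the raw text on `","` leaves `["` attached to the first ID and `"]` attached to the last one. A minimal sketch of the difference, assuming the response always has the shape quoted above (the string below is shortened to two of the quoted IDs and no request is made):

```python
import json

# Text of the alerts response as quoted in the comment, shortened to two IDs.
alerts_text = '["6132fab73125ec00078e7180","6133790774446400078f665a"]'

# Splitting the raw text keeps the JSON brackets and quotes on the outer IDs.
split_ids = alerts_text.split('","')

# Parsing the text as JSON yields a clean list of ID strings.
alert_ids = json.loads(alerts_text)

print(split_ids)
print(alert_ids)
```

With a real `requests` response, `alerts.json()` should give the same kind of list as `json.loads(alerts.text)`.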

1 Answer

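First, you need to parse the JSON that comes back as the response. `alertDetails.text` is a plain string, which is why indexing it with a key raises `TypeError: string indices must be integers`.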

data = alertDetails.json() # use convenience method

Then access the fields you want as you would in a nested dict.

url = data['Details']['Source'].get('URL')
print(f'Phishing URL: {url}')
found_date = data.get('FoundDate')
print(f'Found date: {found_date}')
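To see why the original code failed and why parsing fixes it, here is a minimal sketch that runs without the API. It uses a trimmed copy of the example JSON from the question as a raw string, standing in for `alertDetails.text`, and `json.loads` in place of the `.json()` convenience method:

```python
import json

# Trimmed copy of the example payload from the question, as raw text.
# This stands in for alertDetails.text; no HTTP request is made here.
raw_text = '''
{
   "Details": {
      "Source": {
         "Type": "WHOIS servers",
         "URL": "http://tiktea.com.au",
         "NetworkType": "ClearWeb"
      }
   },
   "FoundDate": "2021-08-16T14:51:17.747Z",
   "Assets": [{"Type": "Domains", "Value": "iktea.net"}]
}
'''

# Indexing the raw string with a key fails, as in the traceback.
try:
    raw_text['Assets']
    string_indexing_failed = False
except TypeError:
    string_indexing_failed = True

# After parsing, the payload is a dict and nested keys can be indexed.
data = json.loads(raw_text)
url = data['Details']['Source']['URL']
found_date = data['FoundDate']
asset_value = data['Assets'][0]['Value']

print(f'URL: {url}')
print(f'FoundDate: {found_date}')
```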

This assumes (based on the error you show) that Details and Source will always be present and that only the URL key may be missing. To be on the safe side, you can do

url = data.get('Details', {}).get('Source', {}).get('URL')
print(f'Phishing URL: {url}')

or use try/except:

try:
    url = data['Details']['Source']['URL']
except KeyError:
    url = None
print(f'Phishing URL: {url}')
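
If you need this lookup in several places, the chained `.get()` calls can be wrapped in a small helper. This is only a sketch: `dig` is a hypothetical name, not part of `requests` or the standard library, and the two sample dicts are cut-down versions of the example payload.

```python
def dig(data, *keys, default=None):
    """Follow keys through nested dicts; return default if any step is missing."""
    current = data
    for key in keys:
        # Stop as soon as a level is not a dict or lacks the key.
        if not isinstance(current, dict) or key not in current:
            return default
        current = current[key]
    return current


# Cut-down sample payloads: one with the URL key, one without it.
with_url = {'Details': {'Source': {'URL': 'http://tiktea.com.au',
                                   'NetworkType': 'ClearWeb'}}}
without_url = {'Details': {'Source': {'NetworkType': 'ClearWeb'}}}

print(f"Phishing URL: {dig(with_url, 'Details', 'Source', 'URL')}")
print(f"Phishing URL: {dig(without_url, 'Details', 'Source', 'URL')}")
```

The helper returns the default instead of raising when any level is absent, so it behaves like the chained `.get()` version for missing keys.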
buran
  • Result: ```Traceback (most recent call last): File "cefexample.py", line 74, in url = data['Details']['Source']['URL'] KeyError: 'URL'``` The strange thing is, with other values it works: if i use: ```url = data['Details']['Source']['NetworkType']``` It works:```python cefexample.py``` result: ```ClearWeb 2021-02-18T16:02:20.926Z``` – Richard Sep 12 '21 at 19:52
  • It works with the example data provided. Is it possible that `URL` key is missing for particular response? You can use `dict.get()` method to handle missing keys – buran Sep 12 '21 at 20:04
  • I did: ```data = alertDetails.json() data2=data.get('Details') data3=data2.get('Source') data4= data3.get('URL') print ('Phishing URL: '+str(data4))``` and IT WORKS!! :D But, as you can see, it is so long. Is there a way to do the same thing more concisely? – Richard Sep 12 '21 at 20:39