I have been using IBM's Natural Language Understanding API to analyze URLs, with the IBM Watson Python SDK 5.1 on Python 3.8.
I successfully used the code below (all the appropriate options had been imported) to extract metadata, in addition to entities, concepts, etc.:
def NLU_analysis(url):
    try:
        response = natural_language_understanding.analyze(
            url=url, return_analyzed_text=True, clean=True, language=True,
            features=Features(keywords=KeywordsOptions(limit=10),
                              entities=EntitiesOptions(limit=10),
                              concepts=ConceptsOptions(limit=5),
                              metadata=MetadataOptions(),
                              categories=CategoriesOptions(limit=5))).get_result()
        return response
    except:
        pass
The code above used to return the metadata. In Python SDK 5.1.0, however, IBM seems to have changed the way to retrieve a URL's metadata: as far as I can tell, the "MetadataOptions" feature has been replaced by "FeaturesResultsMetadata". So I took the code above and replaced MetadataOptions with FeaturesResultsMetadata, as shown below:
def NLU_analysis(url):
    try:
        response = natural_language_understanding.analyze(
            url=url, return_analyzed_text=True, clean=True, language=True,
            features=Features(keywords=KeywordsOptions(limit=10),
                              entities=EntitiesOptions(limit=10),
                              concepts=ConceptsOptions(limit=5),
                              metadata=FeaturesResultsMetadata(),
                              categories=CategoriesOptions(limit=5))).get_result()
        return response
    except:
        pass
When I run the modified code, I get the following error message: "TypeError: Object of type FeaturesResultsMetadata is not JSON serializable"
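The message looks like the one Python's standard json module raises when it is handed an object it does not know how to encode. Here is a minimal sketch that reproduces the same message without the SDK. The class below is a hypothetical stand-in that only shares the name (it is not the real ibm_watson class), and try_serialize is a helper made up for this illustration:

```python
import json


class FeaturesResultsMetadata:
    """Hypothetical stand-in for an SDK model class; NOT the real ibm_watson class."""

    def __init__(self, title=None):
        self.title = title


def try_serialize(payload):
    """Return the JSON string, or the TypeError message if encoding fails."""
    try:
        return json.dumps(payload)
    except TypeError as exc:
        return f"TypeError: {exc}"


# A payload made only of plain dicts encodes without trouble.
ok = try_serialize({"features": {"metadata": {}}})

# An arbitrary Python object nested in the payload cannot be encoded.
failed = try_serialize({"features": {"metadata": FeaturesResultsMetadata()}})

print(ok)
print(failed)
```

If this is what happens inside the SDK, it would suggest that the object I pass as metadata is not something the request builder knows how to turn into JSON, but that is only my guess.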
Reading IBM's API documentation leaves me somewhat confused, because its code example still uses MetadataOptions. Here is IBM's code example:
import json
from ibm_watson import NaturalLanguageUnderstandingV1
from ibm_cloud_sdk_core.authenticators import IAMAuthenticator
from ibm_watson.natural_language_understanding_v1 import Features, MetadataOptions

authenticator = IAMAuthenticator('{apikey}')
natural_language_understanding = NaturalLanguageUnderstandingV1(
    version='2020-08-01',
    authenticator=authenticator
)
natural_language_understanding.set_service_url('{url}')

response = natural_language_understanding.analyze(
    url='www.ibm.com',
    features=Features(metadata=MetadataOptions())).get_result()
Does anyone know whether it is still possible to retrieve a URL's metadata using IBM Watson's Natural Language Understanding API?
Have a nice day!