I have been using IBM's Natural Language Understanding API to analyze URLs, with the IBM Watson Python SDK 5.1 on Python 3.8.

I successfully used the code below (all appropriate options have been imported) to extract metadata, in addition to entities, concepts, etc.:

def NLU_analysis(url):
    try:
        response = natural_language_understanding.analyze(
            url=url, return_analyzed_text=True, clean=True, language=True,
            features=Features(keywords=KeywordsOptions(limit=10),
                              entities=EntitiesOptions(limit=10),
                              concepts=ConceptsOptions(limit=5),
                              metadata=MetadataOptions(),
                              categories=CategoriesOptions(limit=5))).get_result()
        return response
    except:
        pass

The code above used to return the metadata. In Python SDK 5.1.0, however, IBM changed the way a URL's metadata is retrieved: the "MetadataOptions" feature has been replaced by "FeaturesResultsMetadata". I replaced MetadataOptions with FeaturesResultsMetadata in the code above, as shown below:

def NLU_analysis(url):
    try:
        response = natural_language_understanding.analyze(
            url=url, return_analyzed_text=True, clean=True, language=True,
            features=Features(keywords=KeywordsOptions(limit=10),
                              entities=EntitiesOptions(limit=10),
                              concepts=ConceptsOptions(limit=5),
                              metadata=FeaturesResultsMetadata(),
                              categories=CategoriesOptions(limit=5))).get_result()
        return response
    except:
        pass

Now, if I run the modified code, I get the following error message: "TypeError: Object of type FeaturesResultsMetadata is not JSON serializable"

IBM's documentation leaves me somewhat confused (link to the API documentation). Here is IBM's code example:

import json
from ibm_watson import NaturalLanguageUnderstandingV1
from ibm_cloud_sdk_core.authenticators import IAMAuthenticator
from ibm_watson.natural_language_understanding_v1 import Features, MetadataOptions

authenticator = IAMAuthenticator('{apikey}')
natural_language_understanding = NaturalLanguageUnderstandingV1(
    version='2020-08-01',
    authenticator=authenticator
)

natural_language_understanding.set_service_url('{url}')

response = natural_language_understanding.analyze(
    url='www.ibm.com',
    features=Features(metadata=MetadataOptions())).get_result()

Does anyone know whether it is still possible to retrieve a URL's metadata using IBM Watson's Natural Language Understanding API?

Have a nice day!

Joost Vos

1 Answer

It appears that the sample in IBM's API documentation is incorrect.

The code below is IBM's sample code with the obsolete elements removed: the MetadataOptions import is dropped, and MetadataOptions() is replaced by an empty dictionary.

import json
from ibm_watson import NaturalLanguageUnderstandingV1
from ibm_cloud_sdk_core.authenticators import IAMAuthenticator
from ibm_watson.natural_language_understanding_v1 import Features

authenticator = IAMAuthenticator('{apikey}')
natural_language_understanding = NaturalLanguageUnderstandingV1(
    version='2020-08-01',
    authenticator=authenticator
)

natural_language_understanding.set_service_url('{url}')

response = natural_language_understanding.analyze(
    url='www.ibm.com',
    features=Features(metadata={})).get_result()

print(json.dumps(response, indent=2))

So, to request the metadata object, just pass an empty dictionary (metadata={}).
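For what it's worth, the error message from the question can be reproduced without the SDK at all, which may help explain why the empty dictionary works. The sketch below is only an illustration: it assumes the SDK eventually passes the features through json.dumps when building the request body, and both the FeaturesResultsMetadata class and the build_request_body helper here are hypothetical stand-ins, not the SDK's own code.

```python
import json


class FeaturesResultsMetadata:
    """Hypothetical stand-in for the SDK's result model (not the real class)."""


def build_request_body(metadata):
    # Assumption: the SDK ends up serializing the features with json.dumps
    # when it builds the request body. This helper only mimics that step.
    return json.dumps({"features": {"metadata": metadata}})


# A plain empty dictionary serializes without trouble.
ok_body = build_request_body({})

# An arbitrary object is rejected by the json module.
try:
    build_request_body(FeaturesResultsMetadata())
    error_message = None
except TypeError as exc:
    error_message = str(exc)

print(ok_body)
print(error_message)
```

If that assumption holds, any value that json.dumps can handle natively (such as a dict) is accepted for the metadata feature, while a result model object is not.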

Enjoy your day! Joost

Joost Vos