I'm setting up a Jupyter Notebook for a machine learning project in IBM Watson Studio, and I keep getting a TypeError saying that an object is not JSON serializable when I try to classify data from my PostgreSQL database table.

The full error output:

TypeError                                 Traceback (most recent call last)
<ipython-input-16-e72fac39b809> in <module>()
      1 classes = natural_language_classifier.classify('998520s521-nlc-1398', data_df_1.to_json())
----> 2 print(json.dumps(classes, indent=2))

/opt/conda/envs/DSX-Python35/lib/python3.5/json/__init__.py in dumps(obj, skipkeys, ensure_ascii, check_circular, allow_nan, cls, indent, separators, default, sort_keys, **kw)
    235         check_circular=check_circular, allow_nan=allow_nan, indent=indent,
    236         separators=separators, default=default, sort_keys=sort_keys,
--> 237         **kw).encode(obj)
    238 
    239 

/opt/conda/envs/DSX-Python35/lib/python3.5/json/encoder.py in encode(self, o)
    198         chunks = self.iterencode(o, _one_shot=True)
    199         if not isinstance(chunks, (list, tuple)):
--> 200             chunks = list(chunks)
    201         return ''.join(chunks)
    202 

/opt/conda/envs/DSX-Python35/lib/python3.5/json/encoder.py in _iterencode(o, _current_indent_level)
    434                     raise ValueError("Circular reference detected")
    435                 markers[markerid] = o
--> 436             o = _default(o)
    437             yield from _iterencode(o, _current_indent_level)
    438             if markers is not None:

/opt/conda/envs/DSX-Python35/lib/python3.5/json/encoder.py in default(self, o)
    177 
    178         """
--> 179         raise TypeError(repr(o) + " is not JSON serializable")
    180 
    181     def encode(self, o):

TypeError: <watson_developer_cloud.watson_service.DetailedResponse object at 0x7f64ee350240> is not JSON serializable

And here's the Python code in my notebook that applies the AI model to analyze this data:

import json

import pandas as pd
import psycopg2
from watson_developer_cloud import NaturalLanguageClassifierV1


# Connecting to my database.
conn_string = 'host={} port={}  dbname={}  user={}  password={}'.format('159.***.20.***', 5432, 'searchdb', 'lcq09', 'Mys3cr3tPass')
conn_cbedce9523454e8e9fd3fb55d4c1a52e = psycopg2.connect(conn_string)
data_df_1 = pd.read_sql('SELECT description from public."search_product"', con=conn_cbedce9523454e8e9fd3fb55d4c1a52e)

# Connecting to the ML model.
natural_language_classifier = NaturalLanguageClassifierV1(
    iam_apikey='TB97dFv8Dgug6rfi945F3***************'
)

# Apply the ML model to the database data.
classes = natural_language_classifier.classify('9841d0z5a1-ncc-9076', data_df_1.to_json())
print(json.dumps(classes, indent=2))

I ran print(data_df_1.to_json()) to make sure the data is valid JSON, and it is, as you can see below. (PS: the data below is random Lorem sentences, but it will be product descriptions after testing.)

{"description":{"0":"Lorem ipsum sjvh  hcx bftiyf,  hufcil, igfgvjuoigv gvj ifcil ,ghn fgbcggtc   yfctgg h vgchbvju.","1":"Lorem ajjgvc wiufcfboitf iujcvbnb hjnkjc  ivjhn oikgjvn uhnhgv 09iuvhb  oiuvh boiuhb mkjhv mkiuhygv m,khbgv mkjhgv mkjhgv.","2":"Lorem aiv ibveikb jvk igvcib ok blnb v  hb b hb bnjb bhb bhn bn vf vbgfc vbgv nbhgv bb nb nbh nj mjhbv mkjhbv nmjhgbv nmkn","3":"Lorem jsvc smc cbd ciecdbbc d vd bcvdvbj obcvb vcibs j dvx","4":"Lorem jsvc smc cbd ciecdbbc d vd bcvdvbj obcvb vcibs j dvx","5":"Lorem jsvc smc cbd ciecdbbc d vd bcvdvbj obcvb vcibs j dvx"}}

Also, I'm able to classify a single sentence with the code below, but I want to classify the whole description column of my database table:

classes = natural_language_classifier.classify('998260x551-nlc-1018', 'How hot will it be today?')
print(json.dumps(classes.result, indent=2))

That is why I replaced the sentence with the dataframe named data_df_1.

But I get the TypeError mentioned above when I do that.

What should I do to fix this error?

locq
  • 1
    I think you're misreading the error: `TypeError: is not JSON serializable`. It is the object that is not serializable, not the TypeError. – Skam May 22 '19 at 19:36

1 Answer

Your issue is that inside your dataframe there is a watson_developer_cloud.watson_service.DetailedResponse object, which Python's json module does not know how to serialize.

Looking at the API, it looks like you could call the detailed_response._to_dict() instance method (this would be frowned upon because it is a private method), or call the detailed_response.get_result() method to extract the data from the object as a dictionary.

Ideally, you would preprocess the dataframe by applying one of the two methods above to every row of the column that contains that object; then .to_json should no longer throw a TypeError for that column.

col = 'column_with_unserializable_type'
# Replace each DetailedResponse in the column with its plain result data.
data_df_1[col] = data_df_1[col].map(lambda x: x.get_result())
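
The traceback in the question is raised by json.dumps(classes, indent=2), and the single-sentence example in the question that works reads classes.result instead. A minimal sketch of that difference follows. It assumes only that the response object exposes a result attribute, as that working example suggests. FakeDetailedResponse and to_serializable are hypothetical names made up for this illustration, not part of the Watson SDK, and the payload is invented.

```python
import json


class FakeDetailedResponse:
    """Hypothetical stand-in for the SDK response; it only mimics `.result`."""

    def __init__(self, result):
        self.result = result


def to_serializable(obj):
    """Fallback for json.dumps: unwrap objects that carry a `.result` payload."""
    if hasattr(obj, "result"):
        return obj.result
    raise TypeError(repr(obj) + " is not JSON serializable")


# Made-up payload for illustration; this is not real classifier output.
response = FakeDetailedResponse({"top_class": "temperature"})

# Option 1: serialize the payload rather than the wrapper object.
text_from_result = json.dumps(response.result, indent=2)

# Option 2: let json.dumps unwrap any wrapper it encounters via `default`.
text_from_default = json.dumps({"answer": response}, indent=2,
                               default=to_serializable)

print(text_from_result)
```

Passing the wrapper itself to json.dumps without a default hook raises the same kind of TypeError as in the question, because the json module only knows how to encode dicts, lists, strings, numbers, booleans and None.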
Skam
  • Thanks for your answer! The problem is that I don't know which column gives the error, so I don't know what to put in `col = 'column_with_unserializable_type'`. Is there a way to find it? – locq May 22 '19 at 19:47
  • I just replaced `print(json.dumps(classes, indent=2))` with `print(json.loads(classes, indent=2))` and now I'm getting another error: `TypeError: the JSON object must be str, not 'DetailedResponse'`. – locq May 23 '19 at 03:08
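
One way to answer the "which column" question from the comments is to try serializing each column on its own and collect the ones that fail. The sketch below works on a plain dict mapping column names to lists of values, which is the shape that data_df_1.to_dict(orient="list") would be expected to produce. The function name unserializable_columns, the Opaque class and the sample table are all hypothetical and made up for this illustration.

```python
import json


def unserializable_columns(columns):
    """Return the names of columns holding a value that json.dumps rejects."""
    bad = []
    for name, values in columns.items():
        try:
            json.dumps(values)
        except (TypeError, ValueError):
            bad.append(name)
    return bad


class Opaque:
    """Hypothetical object type that the json module cannot encode."""


# Made-up data in column -> list-of-values form.
table = {
    "description": ["Lorem ipsum", "Lorem jsvc"],
    "response": [Opaque(), Opaque()],
}

bad_columns = unserializable_columns(table)
print(bad_columns)
```

Since print(data_df_1.to_json()) already succeeded in the question, a check like this would probably report no problem columns for that dataframe, which suggests looking at the classes object itself. As for the second error, json.loads goes in the opposite direction from json.dumps: it parses a JSON string into Python objects, so it rejects anything that is not a string.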