I have a JSON file with a description field that I convert to a string in order to remove HTML tags, but my function prints a list of single-character unicode values, as shown below:
[u'', u'', u'', u'c', u'i', u's', u' ', u'b', u'y', u' ', u'd', u'e', u'l', u'o', u'i', u't', u't', u'e', u'']
I want to get the plain text out of the above output, i.e. cis by deloitte. How can I resolve this? The code I have tried is shown below:
    import re

    def cleaning_data(input_json_data):
        jd = input_json_data['description']
        jd = [x.lower() for x in jd]
        jd = str(jd)
        jd = re.sub('<[^>]*>', '', jd)
        print jd
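
A likely cause is the list comprehension: iterating over a string yields one character at a time, so str() ends up being applied to a list of characters rather than to the text itself. The following is a minimal sketch of one possible fix, assuming input_json_data['description'] is a single string. It lowercases the string directly and returns the result instead of printing it; the sample input in the usage line is hypothetical.

```python
import re

def cleaning_data(input_json_data):
    # Assumption: 'description' holds one string, not a list of strings.
    jd = input_json_data['description']
    # Lowercase the whole string; a list comprehension over a string
    # would split it into individual characters.
    jd = jd.lower()
    # Remove anything that looks like an HTML tag.
    jd = re.sub(r'<[^>]*>', '', jd)
    # Trim leading and trailing whitespace left behind by the tags.
    return jd.strip()

# Hypothetical sample input, for illustration only.
cleaned = cleaning_data({'description': u'<div><p><b>CIS by Deloitte</b>'})
```

If the result still shows a u'' prefix in Python 2, that is only how a unicode string is displayed by repr(); printing the value shows the plain text.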