I'm working on an NLP task. I compare a dataframe of articles with a set of input words. The main goal is to classify a text when some of those words are found in it.

I've tried extracting the values of the dictionary, converting them into a list, and then applying stemming to it. The problem is that later I'll need another step to split the list and compare it according to the keys. I think it is more practical to work directly on the dictionary.

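# assumes `stemmer` is already defined, e.g. an NLTK stemmer object with a .stem() method
search = {'Tecnology' : ['computer', 'digital', 'sistem'], 'Economy' : ['bank', 'money']}
# collect the word list of each key
words_list = list()
for key in search.keys():
    words_list.append(search[key])
# flatten the list of lists into a single list of words
search_values = [val for sublist in words_list for val in sublist]
# stem every word (the link to the keys is lost at this point)
search_values_stem = [stemmer.stem(word) for word in search_values]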

I expect to get a stemmed dictionary that I can compare directly with the stemmed column of the articles.

Chacho Fuva

1 Answer

If I understood your question correctly, you want to apply stemming to the values of your dictionary (not the keys), and the values in your dictionary are all lists of strings.

The following code should do that:

def stemList(l):
    # stem every word in a list of strings
    return [stemmer.stem(word) for word in l]

# your initial dictionary is called search (as in your example code)
# the following creates a new dictionary where stemming has been applied to the values

stemmedSearch = {}
for key in search:
    stemmedSearch[key] = stemList(search[key])
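
For reference, the same idea can be sketched as a dict comprehension, followed by one possible way to compare the result with an already stemmed article. This is only a sketch under assumptions: `SuffixStemmer` is a hypothetical toy stand-in so that the snippet runs without extra packages (in real use you would probably swap in something like `nltk.stem.PorterStemmer()`), and `matching_categories` is an illustrative helper that treats a key as a match when at least one of its words appears in the article.

```python
class SuffixStemmer:
    """Toy stand-in for a real stemmer (hypothetical, for illustration only)."""
    SUFFIXES = ("ing", "s")

    def stem(self, word):
        word = word.lower()
        for suffix in self.SUFFIXES:
            # strip a known suffix, but leave very short words alone
            if word.endswith(suffix) and len(word) > len(suffix) + 2:
                return word[: -len(suffix)]
        return word


# Assumption: any object with a .stem(word) method works here,
# e.g. nltk.stem.PorterStemmer() in real use.
stemmer = SuffixStemmer()

search = {'Tecnology': ['computer', 'digital', 'sistem'], 'Economy': ['bank', 'money']}

# Stem each list of values while keeping the keys untouched
stemmed_search = {
    key: [stemmer.stem(word) for word in words]
    for key, words in search.items()
}


def matching_categories(stemmed_tokens, stemmed_search):
    """Return the keys that share at least one stemmed word with the article."""
    tokens = set(stemmed_tokens)
    return [key for key, words in stemmed_search.items() if tokens.intersection(words)]


# Example article, tokenized naively on whitespace and stemmed the same way
article = "Banks moved their computers to digital systems"
article_stems = [stemmer.stem(token) for token in article.split()]
print(matching_categories(article_stems, stemmed_search))
```

Because the keys survive the stemming step, no later split by key is needed. If a key should only match when all of its words are present, the intersection test could be replaced by a subset check.
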
DBaker