
I have created an NMF topic model in Python; the code snippet is as follows:

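from sklearn import decomposition
from sklearn.feature_extraction.text import TfidfVectorizer

def select_vectorizer(req_ngram_range=(1, 2)):
    # ngram_range is expected to be a (min_n, max_n) tuple
    ngram_lengths = tuple(req_ngram_range)
    vectorizer = TfidfVectorizer(analyzer='word', ngram_range=ngram_lengths, stop_words='english', min_df=2)
    #print("User specified custom stopwords: {} ...".format(str(custom_stopwords)[1:-1]))
    return vectorizer

# new_review_list, print_top_words and num_top_words are defined elsewhere (not shown here)
vectorizer = select_vectorizer((2, 5))
X = vectorizer.fit_transform(new_review_list)

# Fit a 20-topic NMF model on the tf-idf matrix
clf = decomposition.NMF(n_components=20, random_state=3, alpha=.1).fit(X)
vocab = vectorizer.get_feature_names()
print_top_words(clf, vocab, num_top_words)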

This produced 20 topics such as the following:

Topic #0:
[u'blocks available', u'delivery blocks available', u'notifications blocks', u'notifications blocks available', u'new blocks', u'know blocks available', u'new blocks available', u'know blocks', u'open blocks available', u'available work', u'zero blocks', u'like blocks', u'notification blocks', u'day blocks', u'slow blocks', u'10 blocks', u'option set', u'logged 10', u'notification blocks available', u'day blocks available']
Topic #1:
[u'amazon flex', u'working amazon', u'amazon flex app', u'working amazon flex', u'hello amazon', u'hello amazon flex', u'flex delivery', u'amazon flex delivery', u'flex team', u'amazon flex team', u'work amazon', u'amazon flex support', u'flex support', u'work amazon flex', u'deliver amazon', u'hi amazon flex', u'hi amazon', u'deliver amazon flex', u'signed amazon', u'love amazon']
and so on.

Now I want to apply the fitted model to new texts, so that each new text is assigned to one of these topics. How do I do that?

Arman
  • Just to clarify, this is about the NMF in scikit-learn, right? If so, please add the tag for scikit-learn and maybe take out the tf-idf – alvas May 19 '17 at 01:30
  • Can anyone answer this please!! – Arman May 19 '17 at 07:13
  • Use the `transform` method of the NMF class; see [here](http://scikit-learn.org/stable/modules/generated/sklearn.decomposition.NMF.html) in the docs. – piman314 May 19 '17 at 12:31
  • @ncfirth Can you explain a bit more please? I want to run the fitted NMF model on new text such that it tells me which topic (top topic) the new text belongs to. – Arman May 23 '17 at 00:04
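
Following up on the `transform` suggestion in the comments, something along these lines should work. This is only a minimal sketch under a few assumptions: `train_texts` is a toy stand-in for `new_review_list` (the real reviews are not shown in the question), `top_topics` is a hypothetical helper name, and the model here uses 2 topics and 1-2 grams so that the toy corpus is large enough. The idea is to pass the new texts through the already fitted vectorizer with `transform` (not `fit_transform`), pass the result through the fitted NMF model's `transform`, and take the topic with the largest weight in each row.

```python
import numpy as np
from sklearn.decomposition import NMF
from sklearn.feature_extraction.text import TfidfVectorizer

# Toy stand-in for new_review_list (assumption: the real data is a list of strings).
train_texts = [
    "no delivery blocks available today",
    "new delivery blocks available this morning",
    "notifications when blocks available",
    "working for amazon flex is great",
    "hello amazon flex support team",
    "love working for amazon flex",
]

# Fit the vectorizer and the NMF model once, on the training texts only.
vectorizer = TfidfVectorizer(analyzer="word", ngram_range=(1, 2),
                             stop_words="english", min_df=2)
X = vectorizer.fit_transform(train_texts)
clf = NMF(n_components=2, random_state=3, max_iter=500).fit(X)


def top_topics(texts, vectorizer, model):
    """Return (top topic index per text, document-topic weight matrix)."""
    # transform, not fit_transform: reuse the vocabulary learned during training.
    X_new = vectorizer.transform(texts)
    # W has shape (n_texts, n_components); row i holds the topic weights of text i.
    W = model.transform(X_new)
    return W.argmax(axis=1), W


new_texts = ["are there any blocks available", "question for amazon flex support"]
topics, weights = top_topics(new_texts, vectorizer, clf)
for text, topic, row in zip(new_texts, topics, weights):
    print("Topic #{}: {!r} weights={}".format(topic, text, np.round(row, 3)))
```

With the model from the question, the same two `transform` calls on `vectorizer` and `clf` would give a weight matrix with 20 columns, and the column index of the largest weight corresponds to the `Topic #` numbers printed above. One caveat: a new text that shares no n-grams with the training vocabulary gets an all-zero row, so `argmax` falls back to topic 0; it is probably worth checking for a zero row sum before trusting the result.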

0 Answers