Questions tagged [nmf]

Non-negative matrix factorization (NMF or NNMF), also called non-negative matrix approximation, is a group of algorithms in multivariate analysis and linear algebra in which a matrix V is factorized into (usually) two matrices W and H, with the property that all three matrices have no negative elements.

NMF is a technique to approximate a matrix V as a product, V ≈ WH. The dimensions of V, W, and H are m×n, m×p, and p×n respectively, where usually p << n. W can be thought of as a weight matrix for hidden variables. Because p can be very small, NMF can also be viewed as a dimensionality reduction technique.
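As a rough illustration of the shapes involved, the sketch below factorizes a small non-negative matrix with the classic multiplicative update rule (Lee and Seung). This is only one of several NMF algorithms and is not claimed to be what any particular library does internally. The function name `nmf_multiplicative`, the toy sizes (m=6, n=5, p=2), the iteration count, and the random seeds are all assumptions made for the example.

```python
import numpy as np


def nmf_multiplicative(V, p, n_iter=500, eps=1e-9, seed=0):
    """Approximate a non-negative V (m x n) as W (m x p) @ H (p x n).

    Hypothetical helper for illustration: multiplicative updates that
    reduce the Frobenius norm of V - W @ H while keeping W and H >= 0.
    """
    rng = np.random.default_rng(seed)
    m, n = V.shape
    # Strictly positive random start so the updates never divide by zero.
    W = rng.random((m, p)) + eps
    H = rng.random((p, n)) + eps
    for _ in range(n_iter):
        # Element-wise updates; non-negative factors stay non-negative.
        H *= (W.T @ V) / (W.T @ W @ H + eps)
        W *= (V @ H.T) / (W @ H @ H.T + eps)
    return W, H


# Toy data: a non-negative matrix built to have rank 2.
data_rng = np.random.default_rng(42)
V = data_rng.random((6, 2)) @ data_rng.random((2, 5))

W, H = nmf_multiplicative(V, p=2)
rel_error = np.linalg.norm(V - W @ H) / np.linalg.norm(V)

print(W.shape, H.shape)  # shapes are (m, p) and (p, n)
print(rel_error)         # relative reconstruction error
```

Each row of W gives the weights of the p hidden variables for one row of V, so W can be used as a reduced p-dimensional representation of the data.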

NMF is widely applicable in real-world cases where V cannot contain negative values, such as recommender systems.

Because NMF is an advanced topic, questions with this tag should be mathematically precise and should describe the specific application.

77 questions
3 votes, 2 answers

R NMF package: How to extract sample classifications?

In the NMF R-package one can use consensusmap() to visualise outputs. The plots show which samples belong to which clusters in the "consensus" track. I would like to extract this sample classification such that I get a data frame like this: Sample …
Esben Eickhardt
3 votes, 2 answers

How can I correctly use Pipeline with MinMaxScaler + NMF to predict data?

This is a very small sklearn snippet: logistic = linear_model.LogisticRegression() pipe = Pipeline(steps=[ ('scaler_2', MinMaxScaler()), ('pca', decomposition.NMF(6)), ('logistic', logistic), ]) from sklearn.cross_validation…
Bear Huang
2 votes, 0 answers

ERROR: long vectors (argument 1) are not supported in .C [in call to 'silhouette.default'] with R package NMF

Currently I'm running R version 3.6.0 (2019-04-26) on a Debian server with 264 GB of memory and an Intel(R) Xeon(R) CPU. I'm trying to run the NMF calculation for a matrix of about (5*10^4) * 1100. It works well when I set a specific rank, such…
zzbb2266
2 votes, 1 answer

Surprise NMF throws ZeroDivisionError: float division

I'm trying to do a basic recommendation system. I use Surprise's NMF model for this. Here is my dataset just before starting to work with NMF: store_id item_id quantity 0 62693933 912003029 3.000 1 62693933 912003034 4.000 2…
emremrah
2 votes, 0 answers

Scikit-learn NMF removing duplicate words

I'm using scikit-learn's NMF algorithm to extract trending words from some blogs. For example, I have "game thrones" (which is good, although "of" is dropped as a stopword), but I also have "game" and "thrones". I have "marcus hutchins" (good) but I…
2 votes, 1 answer

Non negative matrix factorisation in python on individual images

I am trying to apply NMF to a particular image that is loaded in grayscale mode. I have tried several links, but after applying NMF my image remains almost the same and cannot be distinguished from the grayscale image initially loaded. However,…
2 votes, 0 answers

How to test the trained NMF topic model on new text

I have created an NMF topic model in Python; the code snippet is as follows: def select_vectorizer(req_ngram_range=[1,2]): ngram_lengths = req_ngram_range vectorizer = TfidfVectorizer(analyzer='word', ngram_range=(ngram_lengths),…
Arman
2 votes, 1 answer

IndexError: out of bounds using NMF in sklearn

I am attempting to create topic models from a corpus of data. The code is able to properly use NMF to generate the requested number of topics from the parsed data; however, it breaks when the corpus length = 20, as seen below 20 [u'bell', u'closed',…
sudo_coffee
1 vote, 1 answer

How to determine which document falls under a particular topic after applying topic modelling techniques like NMF, LDA, BERTopic?

Is there any way to map the topics generated by LDA, NMF, and BERTopic to the list of documents and identify which topic each document belongs to?
Navya
1 vote, 2 answers

get_coherence : C_V method gets an error but U_Mass works

I'm using the following code to check the coherence value. The code below works well when I change the coherence type to "u_mass", but if I want to compute "c_v", an IndexError occurs. Previous text process: # Remove Stopwords, Form…
Victoria L
1 vote, 1 answer

Unable to find dot product of two matrix (W and H from NMF ) with same inner dimensions

I am doing non-negative matrix factorization (NMF) of a matrix A in R. It has genes on the rows and samples on the columns. For NMF, I am using the CRAN package NMF. Once the basis matrix W and coefficient matrix H are computed, I want to check whether…
sp29
1 vote, 1 answer

Topic Modelling - I have used NMF and LDA, what is next?

I have used NMF and LDA for topic modelling in Python, with what I would call good results with NMF, and poor results with LDA. My data is highly domain specific, with a lot of unique/specific vocabulary. I am trying to improve my NMF output by…
Prolle
1 vote, 1 answer

Short text in the context of topic modeling

I am working on topic modeling and I am curious what exactly counts as short text in this context. For example, would a research paper's title and abstract be considered short text?
Sri Test
1 vote, 1 answer

NMF with negative values Python

I'm working with the scikit-learn NMF algorithm and I would like to know if there is any way to use negative values with the algorithm. I need it to work with BVH files. I'm using Python 3.7.5 import numpy as np import re from sklearn.decomposition…
1 vote, 0 answers

Scikit-learn NMF returns NaN values

I am working with a 6650254x5650 sparse matrix whose values are in numpy.float64 format. I am using the NMF implementation from scikit-learn as follows: from sklearn.decomposition import NMF model = NMF(n_components=12, init='random',…
Areza