Nonnegative matrix factorization in Sklearn

Question

I am applying nonnegative matrix factorization (NMF) to a large matrix. Given an m by n matrix A, NMF decomposes it as A = WH, where W is m by d and H is d by n. The ProjectedGradientNMF method is implemented in the Python package Sklearn. I would like the algorithm to return both W and H, but it seems to return only H, not W. Applying the algorithm again to A.T (the transpose) could give me W. However, I would like to avoid computing it twice since the matrix is very large.

If you could tell me how to simultaneously get W and H, that would be great! Below is my code:

from sklearn.decomposition import ProjectedGradientNMF
import numpy
A = numpy.random.uniform(size = [40, 30])
nmf_model = ProjectedGradientNMF(n_components = 5, init='random', random_state=0)
nmf_model.fit(A)
H = nmf_model.components_.T

Could applying the algorithm again to A.T (the transpose) really give W ? I am not able to verify it. — svural, Jul 31 '14 at 16:49

score 19 · Accepted Answer · edited Jul 15 '14 at 11:53

Luckily you can look through the source code:
https://github.com/scikit-learn/scikit-learn/blob/master/sklearn/decomposition/nmf.py

fit_transform() starts at line 460, and at line 530 it shows that H gets attached to components_ and W is returned from the function.

So you shouldn't have to run this twice. Just change:

nmf_model.fit(A)
H = nmf_model.components_.T

to

W = nmf_model.fit_transform(A)
H = nmf_model.components_
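
For anyone who wants to check this end to end, here is a minimal sketch. It assumes a more recent scikit-learn in which ProjectedGradientNMF is no longer available, so it swaps in the `NMF` class from `sklearn.decomposition` (not the class used in the question). The `max_iter` value and the relative-error check are my own additions for illustration.

```python
import numpy as np
from sklearn.decomposition import NMF  # assumption: stands in for ProjectedGradientNMF

# Same toy setup as in the question: a 40 x 30 nonnegative matrix.
rng = np.random.RandomState(0)
A = rng.uniform(size=(40, 30))

# max_iter is an illustrative choice, not a recommended setting.
model = NMF(n_components=5, init="random", random_state=0, max_iter=500)

# A single fit yields both factors: W is returned, H is stored on the model.
W = model.fit_transform(A)   # shape (40, 5), i.e. m by d
H = model.components_        # shape (5, 30), i.e. d by n

# Relative Frobenius error of the rank-5 approximation A ~ W H.
rel_err = np.linalg.norm(A - W @ H) / np.linalg.norm(A)
print(W.shape, H.shape, rel_err)
```

Note that `components_` is already d by n here, so no transpose is needed for the product `W @ H` to have the shape of A.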

Fred Foo

answered Jul 14 '14 at 19:29

Great. I think sklearn should clearly point this out. It focuses too much on feature extraction. – Shuai Zhang Jan 10 '16 at 06:32
