Sklearn TruncatedSVD is not returning n_components

Question

I am fitting an LSA model on a TF-IDF matrix. My original matrix has shape (20, 22096), and I'm applying TruncatedSVD to perform the LSI dimensionality reduction:

    import numpy as np
    from sklearn.decomposition import TruncatedSVD

    svd = TruncatedSVD(n_components=200, random_state=42, n_iter=10)
    svdProfile = svd.fit_transform(profileLSAVectors)
    print(np.shape(svdProfile))  # result: (20, 20)

Instead of getting (20, 200), I'm getting (20, 20).

Does anyone have any idea why?

Vivek Kumar · Answer 1 · 2020-12-28T08:58:23.920


It's the "expected" behaviour in most decomposition procedures in scikit-learn.

I cannot find this mentioned in the documentation of TruncatedSVD, but you can see the documentation for PCA, where it's mentioned that:

n_components == min(n_samples, n_features)

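In your case min(n_samples, n_features) = min(20, 22096) = 20, so at most 20 components can come back even though 200 were requested.

You can try posting this on the scikit-learn GitHub issues page to get more clarity.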
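
To make the min(n_samples, n_features) bound concrete, here is a minimal sketch using plain NumPy rather than scikit-learn's own code path. The random matrix, its 500-column width, and the variable names are assumptions for illustration only; it shows that a matrix with 20 rows has only 20 singular values, so truncating to 200 components cannot return more than 20.

```python
import numpy as np

# Hypothetical stand-in for the TF-IDF matrix: 20 samples, 500 features
# (fewer features than the question's 22096, only to keep the sketch fast).
rng = np.random.default_rng(42)
X = rng.random((20, 500))

requested_components = 200

# Thin SVD: the number of singular values is min(n_samples, n_features).
U, s, Vt = np.linalg.svd(X, full_matrices=False)

# Truncating to the requested size cannot yield more than len(s) components.
k = min(requested_components, len(s))
reduced = X @ Vt[:k].T

print(s.shape)        # (20,)
print(reduced.shape)  # (20, 20)
```

If 200 dimensions are really needed, the input would need at least 200 rows (documents); otherwise it is probably simplest to pick n_components at or below the number of samples.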

edited Dec 28 '20 at 08:58

answered Mar 19 '18 at 05:37

Vivek Kumar


"Its the desired behaviour in most decomposition procedures" - can you please explain / refer a link to why this is? I'd like to know more. – Raghuveer Dec 26 '20 at 20:04

@Raghuveer A better word would be "expected" instead of "desired". I am sorry, but I don't have any resources. Maybe you can look into the linked documentation for PCA above and go through the research papers linked there to get details. – Vivek Kumar Dec 28 '20 at 08:57
