I plan to use sklearn.decomposition.TruncatedSVD to perform LSA for a Kaggle competition. I know the math behind SVD and LSA, but I'm confused by scikit-learn's user guide, so I'm not sure how to actually apply TruncatedSVD.
In the doc, it states that:
After this operation, U_k * transpose(S_k) is the transformed training set with k features (called n_components in the API)
Why is this? I thought that after SVD, X (or at this point X_k) should be U_k * S_k * transpose(V_k)?
And then it says,
To also transform a test set X, we multiply it with V_k: X' = X * V_k
What does this mean?
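To make the two quoted statements concrete, here is a minimal sketch of how they might be checked numerically. It assumes a small random dense matrix as stand-in data; the names X_train, X_test, k, and V_k are my own and do not come from the docs. It compares the output of fit_transform with U_k * S_k from a full NumPy SVD (up to column signs), checks that transform amounts to multiplying by V_k, and shows that U_k * S_k * transpose(V_k) is a different object: the rank-k approximation in the original feature space.

```python
import numpy as np
from sklearn.decomposition import TruncatedSVD

# Toy stand-in data (an assumption for illustration, not real LSA input).
rng = np.random.default_rng(0)
X_train = rng.random((6, 5))   # 6 samples, 5 features
X_test = rng.random((3, 5))    # 3 unseen samples, same 5 features
k = 2                          # n_components in the scikit-learn API

# "arpack" is chosen here so the result is an exact truncated SVD.
svd = TruncatedSVD(n_components=k, algorithm="arpack", random_state=0)
X_train_reduced = svd.fit_transform(X_train)   # shape (6, k)
V_k = svd.components_.T                        # shape (5, k)

# Reference: full SVD with NumPy, truncated to the top k singular triplets.
U, S, Vt = np.linalg.svd(X_train, full_matrices=False)
U_k_S_k = U[:, :k] * S[:k]     # S_k is diagonal, so S_k equals its transpose

# Singular vectors are only defined up to sign, so compare absolute values.
matches_U_k_S_k = np.allclose(np.abs(X_train_reduced), np.abs(U_k_S_k), atol=1e-6)

# Because V_k has orthonormal columns, X * V_k = U_k * S_k on the training set.
matches_X_V_k = np.allclose(X_train_reduced, X_train @ V_k)

# The same projection applied to a test set: X' = X * V_k.
X_test_reduced = svd.transform(X_test)
test_matches_X_V_k = np.allclose(X_test_reduced, X_test @ V_k)

# U_k * S_k * transpose(V_k) is the rank-k approximation of X_train.
# It keeps all 5 original features, so it is not a reduced representation.
X_k = X_train_reduced @ V_k.T

print(X_train_reduced.shape, X_test_reduced.shape, X_k.shape)
print(matches_U_k_S_k, matches_X_V_k, test_matches_X_V_k)
```

If this reading is right, X_k lives in the original feature space while U_k * S_k holds the coordinates of each sample in the k-dimensional space, which would explain why the guide calls the latter the transformed training set.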