I am having trouble interpreting the results from prcomp().
Say I have a centered and scaled data.table called dat, with N columns and M rows. Each column represents a feature and each row a record. I also have an M-dimensional vector of outcomes Y.
I wanted to know what the PCA of this system says. So I just executed:
dat.pca=prcomp(dat,retx=TRUE)
By the elbow method I decided to retain 5 PCA modes, accounting for 90% of the variance. Then I got the following data.table:
dat.pcadata=as.data.table(dat.pca$x)
dat.pcadata has M rows and N columns, and each column corresponds to a PCA mode.
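For reference, here is a minimal sketch of how the retained variance and the shape of the score table can be checked. The data are simulated and the sizes M = 200, N = 10 are placeholders, not my real data; the 90% threshold is the one mentioned above.

```r
library(data.table)

set.seed(1)
M <- 200; N <- 10                       # hypothetical sizes, for illustration only
dat <- as.data.table(scale(matrix(rnorm(M * N), M, N)))

dat.pca <- prcomp(dat, retx = TRUE)

# Proportion of variance carried by each mode, and its running total
var.explained <- dat.pca$sdev^2 / sum(dat.pca$sdev^2)
cum.var <- cumsum(var.explained)

# Smallest number of modes whose cumulative variance reaches 90%
k <- which(cum.var >= 0.90)[1]

# Scores: one row per record, one column per mode (min(M, N) columns)
dat.pcadata <- as.data.table(dat.pca$x)
dim(dat.pcadata)
```

The same cumulative proportions are also printed by summary(dat.pca).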
My question is: am I right that my model should now be trained to forecast the outcomes Y using the first 5 columns of dat.pcadata as features?
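To make the question concrete, this is roughly what I have in mind. It is only a sketch under assumptions: the data and Y are simulated, the feature names f1..f10 are made up, and lm() stands in for whatever model would actually be used.

```r
set.seed(1)
M <- 200; N <- 10                              # hypothetical sizes
X <- scale(matrix(rnorm(M * N), M, N))
colnames(X) <- paste0("f", 1:N)                # made-up feature names
Y <- as.numeric(X %*% rnorm(N) + rnorm(M))     # simulated outcome

dat.pca <- prcomp(X, retx = TRUE)
k <- 5                                         # number of retained modes

# Train on the first k score columns (PC1..PCk) instead of the raw features
train <- data.frame(Y = Y, dat.pca$x[, 1:k])
fit <- lm(Y ~ ., data = train)

# New records have to be projected with the same centering and rotation
X.new <- X[1:3, , drop = FALSE]
scores.new <- predict(dat.pca, newdata = X.new)[, 1:k, drop = FALSE]
pred <- predict(fit, newdata = as.data.frame(scores.new))
```

Here predict() on the prcomp object applies the stored centering and rotation, so new records end up in the same coordinates as the training scores.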