
My question is how to use the principal components obtained using R.

Once you have the principal components, how do you use them to reduce the dimensions? I have a data set (data_set) containing 6 variables that I need to cluster using k-means. When I cluster on all 6 variables, the k-means plot looks scattered. I thought PCA could reduce the dimensions so that k-means would produce more useful results.

I did this to get the principal components:

pca1 <- prcomp(data_set)

Please guide me as to how to proceed further to reduce the dimensionality of the data set.

Grimthorr
N2M
  • The first 2 or 3 components will give you a 2D or 3D representation of your data. Work on those first few components to get the dimension reduction. – Roman Luštrik Sep 16 '13 at 09:15
  • Instead of PCA you can use SVD and get the same results. You can then plot the eigenvalues. – user1436187 Sep 16 '13 at 11:18
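
To illustrate the SVD comment above, here is a minimal sketch in base R. It assumes a made-up 50 x 6 numeric matrix `X` as a stand-in for the real data, and the names `Xc`, `sdev_svd`, `scores_svd` and `eigenvalues` are only illustrative. It compares `svd()` on the column-centered data with `prcomp()` and then plots the eigenvalues.

```r
# Stand-in data: 50 rows, 6 numeric variables (hypothetical, for illustration only)
set.seed(1)
X <- matrix(rnorm(300), nrow = 50, ncol = 6)

# prcomp() centers the columns by default and does not scale them
p <- prcomp(X)

# Center the columns the same way, then take the SVD
Xc <- scale(X, center = TRUE, scale = FALSE)
s <- svd(Xc)

# Singular values divided by sqrt(n - 1) give the component standard deviations
sdev_svd <- s$d / sqrt(nrow(X) - 1)

# Scores from the SVD: U %*% D, which equals Xc %*% V
scores_svd <- s$u %*% diag(s$d)

# Eigenvalues of the covariance matrix are the component variances
eigenvalues <- sdev_svd^2
barplot(eigenvalues,
        names.arg = paste0("PC", seq_along(eigenvalues)),
        ylab = "Variance")
```

The scores can differ from `p$x` by the sign of a whole column, which does not change the geometry, so any comparison should be made on absolute values.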

1 Answer


You can see what a function returns by reading its help page, for example ?prcomp. This is what I used to do with another package, FactoMineR:

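library(FactoMineR)
library(ggplot2)

# PCA on standardized variables, without FactoMineR's automatic plots
pca <- PCA(data_set, scale.unit = TRUE, graph = FALSE)

# Coordinates of the individuals on the components (the scores)
scores <- data.frame(pca$ind$coord)

# Score plot: PC1 against PC2, each point labelled with its row name
ggplot(scores, aes(Dim.1, Dim.2)) +
  geom_text(label = rownames(scores), colour = "red") +
  geom_hline(yintercept = 0) +
  geom_vline(xintercept = 0) +
  labs(title = "Score plot")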

This gives you the score plot for PC1 and PC2. The loadings plot works the same way, starting from the coordinates of the variables:

loadings <- data.frame(pca$var$coord)
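
Coming back to the `prcomp()` call in the question, here is a minimal sketch of the actual dimension reduction step in base R. It assumes a simulated 100 x 6 data frame in place of the real `data_set`, keeps two components, and uses `centers = 3` purely for illustration. None of those choices come from the question. The scores are in `pca1$x`, so keeping its first columns gives the reduced data to pass to `kmeans()`.

```r
# Stand-in data: 100 rows, 6 numeric variables (replace with your own data_set)
set.seed(42)
data_set <- as.data.frame(matrix(rnorm(600), nrow = 100, ncol = 6))

# scale. = TRUE standardizes each variable before the rotation
pca1 <- prcomp(data_set, center = TRUE, scale. = TRUE)

# Proportion of variance explained by each component
var_explained <- pca1$sdev^2 / sum(pca1$sdev^2)

# pca1$x holds the scores (the data in PC coordinates);
# the first two columns are the reduced data set
reduced <- pca1$x[, 1:2]

# k = 3 is an arbitrary choice for illustration
km <- kmeans(reduced, centers = 3, nstart = 25)

# Plot the observations on PC1/PC2, coloured by cluster
plot(reduced, col = km$cluster, pch = 19, xlab = "PC1", ylab = "PC2")
```

How many components to keep is a judgment call. Looking at `var_explained` (or `summary(pca1)`) shows how much of the total variance the first few components retain.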