
I am doing a cluster analysis on a data set, "college", which consists of 3 nominal and 20 numeric variables.

# select the rows (observations) assigned to cluster 1
cluster_1 <- mat[which(groups==1),]

# "cluster_1" is the data set produced by the cluster analysis; it consists of 125 observations.


rbind(cluster_1[, -(1:3)], colMeans(cluster_1[, -(1:3)]))
# This calculates each numeric column's mean and attaches the means as a new row at the bottom of the data set "cluster_1".

What I want to know now is how to calculate each column's sample variance and sample standard deviation, and how to attach them to the bottom of the data set "cluster_1".

Please let me know.

  • From a design point of view, adding summary statistics at the bottom of your data.frame is pretty bad. It means you won't be able to do much more analysis on the data now that it contains apples and oranges. You'd better keep them in a separate data structure. – flodel Nov 29 '13 at 00:56

1 Answer

  # drop the 3 nominal columns first; sd() and var() need numeric input
  num <- cluster_1[, -(1:3)]
  rbind(num, apply(num, 2, sd), apply(num, 2, var))
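
Following the comment above about keeping summaries out of the data itself, here is a minimal sketch that collects the mean, sample variance and sample standard deviation in a separate matrix. The toy `cluster_1` below, with its column names and values, is a hypothetical stand-in, since the real "college" data is not shown in the question; `num` and `stats` are also names made up for illustration.

```r
# Toy stand-in for cluster_1: 3 nominal columns followed by numeric ones.
# (Hypothetical data; the real "college" data set is not shown.)
cluster_1 <- data.frame(
  name   = c("A", "B", "C", "D"),
  state  = c("NY", "CA", "TX", "WA"),
  type   = c("public", "private", "public", "private"),
  cost   = c(10, 12, 14, 16),
  enroll = c(100, 200, 300, 400)
)

num <- cluster_1[, -(1:3)]          # numeric columns only

# One row per statistic, one column per variable.
# sapply() works column by column on a data frame.
stats <- rbind(
  mean = colMeans(num),
  var  = sapply(num, var),          # sample variance (n - 1 denominator)
  sd   = sapply(num, sd)            # sample standard deviation
)

stats
```

The observations in `cluster_1` stay untouched this way, and a single statistic can be looked up by name, for example `stats["sd", "cost"]`.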
IRTFM