I used hclust to cluster my data and cutree to set the number of clusters to 3. Is there any way to examine each cluster? By "examine" I mean listing the cases/observations that are in, e.g., the first cluster. I tried all the basic functions I know, such as summary() and list(), but they don't seem relevant. Is there a function that does this?

If not: the cutree function returns a vector giving the group/cluster that each of my observations belongs to, something like this:

1,3,1,2,3,3,1

which indicates that my first observation belongs to group 1, the second to group 3, and so on. I am thinking about how to extract the positions where, e.g., group = 1, so that it returns 1, 3 and 7, since observations 1, 3 and 7 belong to group 1.

Or do I need to use a loop to collect all the observations that belong to, e.g., group 1?

Is my question clear?

BigData
  • No, your question is not very clear, but in an attempt to answer: you have the output of the cutree function, 1,3,1,2,3,3,1, and you can use it to subset or group your original data frame in order to examine the individual clusters. FYI: providing an example (data and output) goes a long way toward getting help on this forum. – Dave2e Aug 03 '16 at 23:44
  • I made it, thanks all! – BigData Aug 04 '16 at 12:47

2 Answers

Does this help to get started?

nclust <- 10 

cutreeout <- cutree(hclustOutput, nclust)

Add the cluster assignments as a new column to your data frame:

mydata$cluster <- cutreeout

How many observations are in each cluster?

table(mydata$cluster)

Then you can do more stuff to interpret your clusters, and/or study subsets of your data.
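
To list the observations in a given cluster, which() and split() are probably enough. The following is only a sketch: the built-in USArrests data set stands in for mydata (an assumption, not the asker's data), and k = 3 matches the question rather than the nclust value above.

```r
# Illustrative data: the built-in USArrests data set stands in for mydata (assumption)
mydata <- USArrests
hclustOutput <- hclust(dist(scale(mydata)))   # default linkage method is "complete"
cutreeout <- cutree(hclustOutput, k = 3)

# Positions (row indices) of the observations in cluster 1
idx1 <- which(cutreeout == 1)

# The corresponding rows of the original data
cluster1 <- mydata[idx1, ]

# All clusters at once: a list with one vector of row names per cluster
members <- split(rownames(mydata), cutreeout)

# The same idea with the vector from the question
groups <- c(1, 3, 1, 2, 3, 3, 1)
which(groups == 1)   # 1 3 7
```

No loop is needed: the comparison cutreeout == 1 is vectorised, and split() groups everything in one call.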

knb

This is a hint, not the answer. Here's an example of hierarchical clustering in R. You can try the functions table() and ggplot() to see the observations per cluster.
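
As a rough illustration of that hint, the sketch below counts the observations per cluster with table() and, if the ggplot2 package happens to be installed, draws a scatter plot coloured by cluster. The iris measurements and the choice of plotted columns are assumptions made for the example, not part of the original question.

```r
# Illustrative data: built-in iris measurements (assumption, not the asker's data)
mydata <- iris[, 1:4]
mydata$cluster <- factor(cutree(hclust(dist(mydata[, 1:4])), k = 3))

# Number of observations in each cluster
counts <- table(mydata$cluster)
print(counts)

# Scatter plot coloured by cluster; skipped if ggplot2 is not installed
if (requireNamespace("ggplot2", quietly = TRUE)) {
  p <- ggplot2::ggplot(
    mydata,
    ggplot2::aes(x = Sepal.Length, y = Petal.Length, colour = cluster)
  ) +
    ggplot2::geom_point()
  print(p)
}
```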

Nick