
Given a depth (height) cutoff in a dendrogram, how can I get, for each branch below that cutoff, a list of the names of all the leaves that descend from it?

For example I create this dendrogram:

set.seed(1)
mat <- matrix(rnorm(100*10),nrow=100,ncol=10)
dend <- as.dendrogram(hclust(dist(t(mat))))

Plotting it using dendextend:

require(dendextend)
dend %>% plot

And defining the depth cutoff as 14.5:

abline(h=14.5,col="red")

[Image: the dendrogram plot with a red horizontal line at height 14.5]

The list I expect is:

list(c(5),c(7),c(8),c(10,4,9),c(3,6,1,2))
dan

2 Answers

set.seed(1)
mat <- matrix(rnorm(100*10),nrow=100,ncol=10)
dend <- as.dendrogram(hclust(dist(t(mat))))

require(dendextend)
dend %>% plot
abline(h=14.5,col="red")

The cutree function in dendextend accepts a height cutoff (h) and returns an integer vector of group memberships, one entry per leaf:

> cutree(dend,h=14.5)
 1  2  3  4  5  6  7  8  9 10 
 1  1  1  2  3  1  4  5  2  2 
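
To go from this membership vector to the kind of list the question asks for, one option is to split the leaf indices by group. The sketch below makes a few assumptions: it calls base `stats::cutree` on the `hclust` object (rather than dendextend's dendrogram method) so that it runs without extra packages, and `hc`, `groups`, and `leaf_groups` are illustrative names.

```r
set.seed(1)
mat <- matrix(rnorm(100 * 10), nrow = 100, ncol = 10)
hc <- hclust(dist(t(mat)))

# Group membership for each of the 10 columns at the chosen height.
groups <- cutree(hc, h = 14.5)

# split() collects the leaf indices that share a group id into one list element.
leaf_groups <- split(seq_along(groups), groups)
leaf_groups
```

Each element holds the leaves under one branch below the cutoff. The elements are ordered by group id, not by the left-to-right order of the branches in the plot.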
dan

Not entirely sure if this is the answer you are after, but can you just access them like this?

acme$Accounting$children %>% names()
"New Software"             "New Accounting Standards"

acme$IT$children %>% names()
"Outsource"   "Go agile"    "Switch to R"

Presumably you want to do this automatically, in which case it would be something like:

node_names = c('Accounting', 'IT')
sapply(node_names, function(x) acme[[x]]$children %>% names())

There is probably a more elegant way to do this, but this doesn't seem like a terrible approach.

EDIT

Since the user completely changed the question, here is a new answer:

get_height = function(x){
  a = attributes(x)
  a$height
}

height = 14
dendrapply(dend, function(x) ifelse(get_height(x) < height, x, '')) %>% unlist()

You just need to access the height of each node in the dendrogram and check whether it is above or below the cutoff you want. Unfortunately, this won't group together the leaf nodes that come from the same parent; however, that shouldn't be too difficult to add with a bit of tinkering. Hopefully this gets you on your way.
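
For the grouping step mentioned above, one possible approach (a sketch, not necessarily what was intended here) is base R's `cut()` method for dendrograms, which returns the sub-dendrograms below a given height in its `lower` component; `labels()` then lists the leaves of each one. The names `branches` and `leaf_groups` are illustrative.

```r
set.seed(1)
mat <- matrix(rnorm(100 * 10), nrow = 100, ncol = 10)
dend <- as.dendrogram(hclust(dist(t(mat))))

# cut() splits the tree at height h; $lower holds one
# sub-dendrogram per branch that lies below the cut.
branches <- cut(dend, h = 14.5)$lower

# labels() returns the leaf labels of each branch.
leaf_groups <- lapply(branches, labels)
leaf_groups
```

Because the matrix has no column names, the leaf labels here are simply the column indices. With named columns, the same code should return the names instead.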

SamPassmore
  • Just a note: you need the magrittr package to use the %>% operator, but it is not crucial to the code - it just makes things look nicer. – SamPassmore Jan 08 '17 at 11:41
  • Ok - this question is clearer, but it is an entirely new question. Before, you were asking about taxonomy-like structures using the data.tree package. Now you are asking about dendrograms using the stats package. What are you trying to achieve? – SamPassmore Jan 08 '17 at 21:23
  • You're right. Sorry about that. I edited the title and some details in the question, and it is no longer directly related to data.tree. The only connection to data.tree is that I use it to find the depth cutoff given the dendrogram. And since now this is a given in the question I removed data.tree from the question – dan Jan 08 '17 at 21:27