I have made a reproducible example where I am having trouble with pvclust. My goal is to pick the ideal clusters in a hierarchical cluster dendrogram. I've heard of 'pvclust' but can't figure out how to use it. If anyone has other suggestions for determining the ideal clusters, that would be really helpful.

My code is below:

library(pvclust)    

employee<- c('A','B','C','D','E','F','G','H','I',
         'J','K','L','M','N','O','P',
         'Q','R','S','T',
         'U','V','W','X','Y','Z')   
salary<-c(20,30,40,50,20,40,23,5,56,23,15,43,53,65,67,23,12,14,35,11,10,56,78,23,43,56)
# data.frame() keeps salary numeric; cbind() would coerce both columns to character
testing90<-data.frame(employee,salary)
head(testing90)
row.names(testing90)<-testing90$employee
# drop=FALSE keeps the one-column result a data frame, so the row names survive
testing91<-testing90[,-1,drop=FALSE]
head(testing91)
d<-dist(as.matrix(testing91))
hc<-hclust(d,method = "ward.D2")
hc
plot(hc)

par(cex=0.6, mar=c(5, 8, 4, 1))
plot(hc, xlab="", ylab="", main="", sub="", axes=FALSE)
par(cex=1)
title(xlab="Publishers", main="Hierarchal Cluster of Publishers by eCPM")
axis(2)

fit<-pvclust(d, method.hclust="ward.D2", nboot=1000, method.dist="eucl") 

An error came up stating:

Error in names(edges.cnt) <- paste("r", 1:rl, sep = "") : 
  'names' attribute [2] must be the same length as the vector [0]
analytics
  • could you specify the libraries you are using in your MRE? – erasmortg Sep 30 '15 at 15:58
  • just added the library(pvclust) @erasmortg – analytics Sep 30 '15 at 16:01
  • Hi @analytics, following the answer you got, I'll mention that if you also want to visualize the results, you can consult the pvclust section of the dendextend R package documentation: https://cran.r-project.org/web/packages/dendextend/vignettes/introduction.html#pvclust – Tal Galili Oct 01 '15 at 12:11

1 Answer

One solution is to coerce your object d into a matrix: d is of class dist, which is not one of the input types pvclust accepts.

From the help file of pvclust:

data: numeric data matrix or data frame.

Note that coercing an object of class dist into a matrix fills in both triangles: a dist object stores only the lower triangle, so as.matrix() mirrors it across a zero diagonal and returns a full symmetric matrix. You can check the object that is actually passed to pvclust with the call:

as.matrix(d)
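To make the difference concrete, here is a small base-R sketch that rebuilds the distances from the question's salaries and compares the dist object with its matrix form. Using LETTERS for the employee names is my own shortcut, not part of the original code.

```r
# Salaries from the question; LETTERS stands in for employees 'A'..'Z'.
salary <- c(20, 30, 40, 50, 20, 40, 23, 5, 56, 23, 15, 43, 53,
            65, 67, 23, 12, 14, 35, 11, 10, 56, 78, 23, 43, 56)
testing91 <- data.frame(salary, row.names = LETTERS)

d <- dist(testing91)   # Euclidean distances between employees
class(d)               # "dist"
nrow(d)                # NULL: a dist object has no dim attribute
length(d)              # 325 = 26 * 25 / 2, one value per pair

m <- as.matrix(d)      # full matrix form
dim(m)                 # 26 26
isSymmetric(m)         # TRUE: both triangles are filled in
all(diag(m) == 0)      # TRUE: each employee is at distance 0 from itself
m["A", "B"]            # 10, i.e. |20 - 30|
```

Since nrow(d) is NULL, code that expects a rows-by-columns table has nothing to work with. That is probably what triggers the error in the question, though I have not traced it through the pvclust source.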

This would be the call you are looking for:

#note that I can't 
pvclust(as.matrix(d), method.hclust="ward.D2", nboot=1000, method.dist="eucl")
#Bootstrap (r = 0.5)... Done.
#Bootstrap (r = 0.58)... Done.
#Bootstrap (r = 0.69)... Done.
#Bootstrap (r = 0.77)... Done.
#Bootstrap (r = 0.88)... Done.
#Bootstrap (r = 1.0)... Done.
#Bootstrap (r = 1.08)... Done.
#Bootstrap (r = 1.19)... Done.
#Bootstrap (r = 1.27)... Done.
#Bootstrap (r = 1.38)... Done.
#
#Cluster method: ward.D2
#Distance      : euclidean
#
#Estimates on edges:
#
#      au    bp se.au se.bp      v      c  pchi
#1  1.000 1.000 0.000 0.000  0.000  0.000 0.000
#2  1.000 1.000 0.000 0.000  0.000  0.000 0.000
#3  1.000 1.000 0.000 0.000  0.000  0.000 0.000
#4  1.000 1.000 0.000 0.000  0.000  0.000 0.000
#5  1.000 1.000 0.000 0.000  0.000  0.000 0.000
#6  1.000 1.000 0.000 0.000  0.000  0.000 0.000
#7  1.000 1.000 0.000 0.000  0.000  0.000 0.000
#8  1.000 1.000 0.000 0.000  0.000  0.000 0.000
#9  1.000 1.000 0.000 0.000  0.000  0.000 0.000
#10 1.000 1.000 0.000 0.000  0.000  0.000 0.000
#11 1.000 1.000 0.000 0.000  0.000  0.000 0.000
#12 1.000 1.000 0.000 0.000  0.000  0.000 0.000
#13 1.000 1.000 0.000 0.000  0.000  0.000 0.000
#14 1.000 1.000 0.000 0.000  0.000  0.000 0.000
#15 1.000 1.000 0.000 0.000  0.000  0.000 0.000
#16 1.000 1.000 0.000 0.000  0.000  0.000 0.000
#17 1.000 1.000 0.000 0.000  0.000  0.000 0.000
#18 1.000 1.000 0.000 0.000  0.000  0.000 0.000
#19 0.853 0.885 0.022 0.003 -1.126 -0.076 0.058
#20 0.854 0.885 0.022 0.003 -1.128 -0.073 0.069
#21 0.861 0.897 0.022 0.003 -1.176 -0.090 0.082
#22 0.840 0.886 0.024 0.003 -1.100 -0.106 0.060
#23 0.794 0.690 0.023 0.005 -0.658  0.162 0.591
#24 0.828 0.686 0.020 0.005 -0.716  0.232 0.704
#25 1.000 1.000 0.000 0.000  0.000  0.000 0.000

Note that this will fix your call, but the validity of the clustering method and the quality of your data are for you to decide. I took your MRE at face value.
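Since the goal was to pick clusters, here is a hedged sketch of how the fitted object could be queried with pvpick, which is part of pvclust. Keep in mind that pvclust clusters the columns of its input, so with as.matrix(d) each employee is described by its row of distances to all the others. The 0.95 threshold, the seed, the reduced nboot = 100, and LETTERS as employee names are my own choices for illustration. The block is skipped if pvclust is not installed.

```r
# Salaries from the question; LETTERS stands in for employees 'A'..'Z'.
salary <- c(20, 30, 40, 50, 20, 40, 23, 5, 56, 23, 15, 43, 53,
            65, 67, 23, 12, 14, 35, 11, 10, 56, 78, 23, 43, 56)
d <- dist(data.frame(salary, row.names = LETTERS))

if (requireNamespace("pvclust", quietly = TRUE)) {
  set.seed(1)  # the bootstrap is random; fix the seed for repeatability
  # nboot = 100 keeps this sketch fast; the question used nboot = 1000.
  fit <- pvclust::pvclust(as.matrix(d), method.hclust = "ward.D2",
                          method.dist = "euclidean", nboot = 100)

  # Clusters whose AU p-value is at least 0.95. With the default
  # max.only = TRUE, only the largest such clusters are reported.
  picked <- pvclust::pvpick(fit, alpha = 0.95)
  print(picked$clusters)

  # To mark the same clusters on the dendrogram:
  # plot(fit); pvclust::pvrect(fit, alpha = 0.95)
}
```

Whether 0.95 is a sensible cut-off here depends on your data, so treat the result as a starting point.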

erasmortg