TL;DR: How can I use the WeightedCluster library (the wcKMedoids() method in particular) as input to heatmap, heatmap.2 or similar, to provide it with clustering info?
We are creating a heatmap from some binary data (yes/no values, represented as ones and zeros) in R, and need to adjust the weights of some of the rows for the column-based clustering. (These rows are generated by splitting multiple-choice categories into several binary yes/no rows, and are therefore over-represented.)
I found the WeightedCluster library, which can do clustering with weights.
Now the question is how to use this library (the wcKMedoids() method in particular) as input to heatmap, heatmap.2 or similar?
I have tried the following code, which results in the error message below:
library(gplots)
library(WeightedCluster)
dataset <- "
F,T1,T2,T3,T4,T5,T6,T7,T8
A,1,1,0,1,1,1,1,1
B,1,0,1,0,1,0,1,1
C,1,1,1,1,1,1,1,0
D,1,1,1,0,1,1,1,0
E,0,1,0,0,1,0,1,0
F,0,0,1,0,0,0,0,0
G,1,1,1,0,1,1,1,1
H,1,1,0,0,0,0,0,0
I,1,0,1,0,0,1,0,0
J,1,1,1,0,0,0,0,1
K,1,0,0,0,1,1,1,1
L,1,1,1,0,1,1,1,1
M,0,1,1,1,1,1,1,1
N,1,1,1,0,1,1,1,1"
fakefile <- textConnection(dataset)
d <- read.csv(fakefile, header=T, row.names = 1)
weights <- c(1,1,1,1,1,1,1,1,1,1,1,1,1,1)
distf <- function(x) dist(x, method="binary")
wclustf <- function(x) wcKMedoids(distf(x),
                                  k=8,
                                  weights=weights,
                                  npass = 1,
                                  initialclust=NULL,
                                  method="PAMonce",
                                  cluster.only = FALSE,
                                  debuglevel=0)
cluster_colors <- colorRampPalette(c("red", "green"))(256);
heatmap(as.matrix(d),
        col=cluster_colors,
        distfun = distf,
        hclustfun = wclustf,
        keep.dendro = F,
        margins=c(10,16),
        scale="none")
But running it gives:
Error in UseMethod("as.dendrogram") :
no applicable method for 'as.dendrogram' applied to an object of class "c('kmedoids', 'list')"
Apparently, wcKMedoids is not a drop-in replacement for R's hclust, but does anyone have some pointers on how to work around that?
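One direction that might be worth trying, sketched here in base R only: heatmap accepts a ready-made dendrogram through its Rowv and Colv arguments, so the tree can be built outside of heatmap. The sketch below assumes that hclust's members argument (the number of observations each item stands for) is an acceptable way to express the weights. The matrix m, the row_weights vector and the "average" linkage are made up for illustration, and this swaps k-medoids for weighted hierarchical clustering rather than using wcKMedoids.

```r
# Small binary matrix standing in for the real data (hypothetical values).
m <- matrix(c(1, 1, 0, 1,
              1, 0, 1, 0,
              0, 1, 1, 1,
              1, 1, 1, 0,
              0, 0, 1, 0),
            nrow = 5, byrow = TRUE,
            dimnames = list(LETTERS[1:5], paste0("T", 1:4)))

# Hypothetical weights, one per row; hclust() treats `members` as the
# number of observations that each clustered item represents.
row_weights <- c(1, 1, 2, 1, 1)

distf <- function(x) dist(x, method = "binary")

# Weighted hierarchical clustering of the rows, converted to a dendrogram.
hc_rows <- hclust(distf(m), method = "average", members = row_weights)
row_dend <- as.dendrogram(hc_rows)

# Hand the finished dendrogram to heatmap(); the columns are still
# clustered by heatmap() itself using distfun and the default hclust.
heatmap(m,
        Rowv = row_dend,
        distfun = distf,
        col = colorRampPalette(c("red", "green"))(256),
        scale = "none")
```

Note that this weights the items being clustered. Whether that matches the intended down-weighting of the over-represented rows is something I have not verified.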
UPDATE: The little progress I have made so far suggests that I should implement an as.dendrogram.kmedoids method that produces output similar to that of hclust(dist(x)). (That output can be inspected in detail with dput: dput(hclust(dist(x))).) Ideas and pointers are very welcome.
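For reference, here is a small base-R sketch of what such a method would have to emulate. The matrix x is a made-up example. As far as I can tell, the difficulty is that an hclust object records a full merge tree with heights, while a k-medoids result is a flat partition, so a dendrogram cannot simply be read off it.

```r
# Tiny made-up matrix, only used to get an hclust object to look at.
x <- matrix(c(1, 1, 0,
              1, 0, 1,
              0, 1, 1,
              1, 1, 1),
            nrow = 4, byrow = TRUE,
            dimnames = list(c("A", "B", "C", "D"), NULL))

hc <- hclust(dist(x))

# Class and components of the object that heatmap() normally receives.
print(class(hc))
print(names(hc))

# The merge matrix has one row per merge step, i.e. (number of items - 1) rows.
print(dim(hc$merge))

# Compact view of the whole structure; dput(hc) shows the same in full detail.
str(unclass(hc))

# heatmap() calls as.dendrogram() on this object; the result is what
# an as.dendrogram.kmedoids method would need to return.
dend <- as.dendrogram(hc)
print(class(dend))
```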