Hi friends, I have run into the following behaviour and I don't understand what causes it.
When I build clusters with the hclust function, the labels of the resulting object are lost depending on how I subset the data frame beforehand.
This is the data frame.
set.seed(1234)
x <- rnorm(12,mean=rep(1:3,each=4),sd=0.2)
y <- rnorm(12,mean=rep(c(1,2,1),each=4),sd=0.2)
z <- as.factor(sample(c("A","B"),12,replace=TRUE))
df <- data.frame(x=x,y=y,z=z)
plot(df$x,df$y,col=df$z,pch=19,cex=2)
This chunk of code returns NULL for the labels.
df1 <- df[c("x","y")]
d <- dist(df1)
cluster <- hclust(d)
cluster$labels #NULL
This chunk of code returns NULL as well.
df2 <- df[,1:2]
d <- dist(df2)
cluster <- hclust(d)
cluster$labels #NULL
This chunk of code does not return NULL.
df3 <- df[1:12,1:2]
d <- dist(df3)
cluster <- hclust(d)
cluster$labels #Character Vector
This is a problem for me because I have some code that relies on these labels.
As you can see, identical() reports the three data frames as identical.
identical(df1, df2) # TRUE
identical(df1, df3) # TRUE
identical(df2, df3) # TRUE
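One thing that may be worth checking is how the row names are stored internally. The sketch below is only a guess at the cause, not a confirmed diagnosis: it assumes that dist() converts its input with as.matrix(), and that as.matrix() drops row names it considers "automatic". It uses .row_names_info() to inspect the row names and rownames.force = TRUE as one possible workaround. The variable names info1, info3, rn1, rn3, m1, cluster1 and labels1 are made up for this illustration.

```r
# Rebuild the x/y columns from the example above (z is not needed here).
set.seed(1234)
df <- data.frame(x = rnorm(12, mean = rep(1:3, each = 4), sd = 0.2),
                 y = rnorm(12, mean = rep(c(1, 2, 1), each = 4), sd = 0.2))
df1 <- df[c("x", "y")]   # column-only subset
df3 <- df[1:12, 1:2]     # row and column subset

# .row_names_info() returns a negative count when the row names are
# "automatic" and a positive count when they are stored explicitly.
info1 <- .row_names_info(df1)
info3 <- .row_names_info(df3)

# Assumption: dist() goes through as.matrix(), which by default keeps
# row names only when they are not automatic.
rn1 <- rownames(as.matrix(df1))
rn3 <- rownames(as.matrix(df3))

# Possible workaround: force the row names onto the matrix before
# computing the distances, so hclust() has labels to carry along.
m1 <- as.matrix(df1, rownames.force = TRUE)
cluster1 <- hclust(dist(m1))
labels1 <- cluster1$labels
```

If info1 and info3 differ in sign, that would explain why identical() sees no difference while hclust() does. Another option, under the same assumption, would be to assign the labels after clustering with something like cluster$labels <- rownames(df1).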