I'm using the hclust function in a large script, applied to a data frame (df) as in this example:
HClust <- hclust(d = dist(model.matrix(~-1 + A + B + C + D, df))^2, method = "centroid")
I would like to specify the variables of the df (e.g. MgO, Zn, CaO...) only once and have them filled in automatically when I call hclust().
I tried creating a vector that holds the data frame's variable names in the format needed for the hclust call, but the resulting dendrogram is not correct.
vars_for_clust <- paste(colnames(df),"+")
which gives the following:
vars_for_clust
[1] "A +" "B +" "C +"
and used this vector in the hclust call:
HClust <- hclust(d = dist(model.matrix(~-1 + vars_for_clust, df))^2, method = "centroid")
Something went wrong: the call does not give an error, but the resulting dendrogram is not correct (all the vertical lines have the same length).
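My guess is that the character vector is being treated as a single categorical variable rather than as a list of column names, which would explain the identical distances. To make the goal concrete, here is a minimal sketch of what I am after. It assumes that reformulate() from the stats package is an acceptable way to build the formula from names. The df below is a made-up stand-in with hypothetical columns A to D, not my real data.

```r
# Hypothetical stand-in for the real data: four numeric columns.
set.seed(1)
df <- data.frame(A = rnorm(10), B = rnorm(10), C = rnorm(10), D = rnorm(10))

# Name the clustering variables once, as plain column names (no "+").
vars_for_clust <- c("A", "B", "C", "D")

# Build the one-sided formula ~A + B + C + D with the intercept removed.
f <- reformulate(vars_for_clust, intercept = FALSE)

# Same call as before, with the generated formula in place of the typed one.
HClust <- hclust(d = dist(model.matrix(f, df))^2, method = "centroid")
```

Calling plot(HClust) afterwards should then draw the dendrogram. If every column of df is wanted, vars_for_clust could presumably be set to colnames(df) instead of being typed out.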
Thanks!!
Sample data in: https://github.com/esteful/kaixo