Just a couple thoughts, having used mclust a bit previously.
1) mclust uses the correct BIC selection method; see this post:
https://stats.stackexchange.com/questions/237220/mclust-model-selection
See the very bottom of that thread, but to sum it up: whether you optimize for a low or a high BIC depends on whether the formula includes the negative sign or not:
The general definition of the BIC is
BIC = −2 × ln(L(θ|x)) + k × ln(n);
mclust does not include the negative component, so the model with the highest BIC is selected.
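You can see this convention directly in the fitted object (a sketch assuming the mclust package is installed, using the built-in faithful data set):

```r
library(mclust)

# mclust reports BIC as 2*logLik - k*log(n) (no leading negative sign),
# so the model/number-of-components with the LARGEST BIC is selected
fit <- Mclust(faithful)
fit$bic           # BIC of the chosen model
summary(fit$BIC)  # top models, ranked by highest BIC
```

Note that `fit$bic` is the maximum over the whole BIC table, not the minimum, which is the opposite of what you'd do with the textbook (negated) definition.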
2) mclust uses mixture models to perform the clustering (i.e., it's model-based). That's quite different from k-means, so I would be careful with the phrasing that it's a "tiny bit different than some of the other k-means cluster approaches" (mainly in what "other" implies here). The process for model selection is briefly described in the mclust manual:
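For what it's worth, the difference is easy to show side by side (a sketch using base R's kmeans() and the iris measurements; the label numbers are arbitrary in both methods, so only the partitions are comparable):

```r
library(mclust)

X <- iris[, 1:4]

# k-means: hard assignments, distance-based, implicitly spherical clusters
km <- kmeans(X, centers = 3, nstart = 25)

# mclust: Gaussian mixture fitted by EM; covariance model and G
# are normally chosen by BIC (G fixed here for comparability)
gm <- Mclust(X, G = 3)

# cross-tabulate the two partitions
table(kmeans = km$cluster, mclust = gm$classification)

# unlike k-means, mclust also gives soft (posterior) memberships
head(round(gm$z, 3))
```

The posterior matrix `gm$z` is exactly what feeds the entropy calculations discussed below.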
mclust provides a Gaussian mixture fitted to the data by maximum likelihood through the EM algorithm, for the model and number of components selected according to BIC. The corresponding components are hierarchically combined according to an entropy criterion, following the methodology described in the article cited in the references section. The solutions with numbers of classes between the one selected by BIC and one are returned as a clustCombi class object.
It's more useful to see the actual paper for a thorough explanation:
https://www.stat.washington.edu/raftery/Research/PDF/Baudry2010.pdf
or here https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2953822/
The entropy plot provided by mclust is meant to be interpreted like a scree plot from a factor analysis (i.e., look for an elbow to determine the optimal number of classes). I would argue scree plots are useful for justifying the choice of the number of clusters, and these plots belong in the appendices.
mclust also returns the ICL statistic in addition to BIC, so you could choose to report that as a compromise to the reviewer:
https://cran.r-project.org/web/packages/mclust/vignettes/mclust.html (see the example on how to get it to output the statistics)
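If you go that route, mclustICL() produces an ICL table analogous to the BIC one (a sketch on the built-in faithful data; the same maximize-the-criterion convention applies):

```r
library(mclust)

# ICL for each covariance model / number of mixture components
icl <- mclustICL(faithful)

summary(icl)  # top models, ranked by highest ICL
```

ICL is essentially BIC penalized by the mean entropy of the classification, so it tends to favor more clearly separated clusters than BIC does.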
3) if you wanted to create a table of the entPlot values, you can extract them like so (from the ?entPlot example):
data(Baudry_etal_2010_JCGS_examples)
# run Mclust to get the MclustOutput
output <- clustCombi(ex4.2, modelNames = "VII")
entPlot(output$MclustOutput$z, output$combiM, reg = c(2,3))
# legend: in red, the single-change-point piecewise linear regression;
# in blue, the two-change-point piecewise linear regression.
# added code to extract entropy values from the plot
combiM <- output$combiM
Kmax <- ncol(output$MclustOutput$z)
z0 <- output$MclustOutput$z
ent <- numeric()
for (K in Kmax:1) {
z0 <- t(combiM[[K]] %*% t(z0))
ent[K] <- -sum(mclust:::xlog(z0))
}
data.frame(`Number of clusters` = 1:Kmax, `Entropy` = round(ent, 3))
Number.of.clusters Entropy
1 1 0.000
2 2 0.000
3 3 0.079
4 4 0.890
5 5 6.361
6 6 20.158
7 7 35.336
8 8 158.008