I have been trying to produce a scatterplot (similar to Figure 1) showing the distance of each data point to its multivariate centroid. The data contain one categorical grouping factor with two levels (V4 or G8) in the column Family (the response variable) and 12 predictor variables. The data set is called LDA.scores and can be found at the bottom of the page.

After splitting the data by family into two separate data frames (code below Figure 1), I used the package adegenet to try to produce two scatterplots similar to Figure 1, one for each family, showing the actual number of clusters in the data set. I understand that this package is designed for the analysis of genetic markers; however, I am under the impression that these scatterplots can be produced for any type of multivariate data.

I tried to manipulate the data, but to no avail. I have followed the tutorial and I do not understand the error and warning messages shown below. It makes no difference if I change column [,1] to a numeric value as specified in the manual. If anyone knows how to produce a figure for each family showing the 12 clusters (12 parameters measured) around their multivariate centroids, then thank you so much. All the code and the data are located below Figure 1.
Figure 1
Code used to produce a scatterplot after DAPC
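# An attempt to create a scatterplot for the categorical factor V4
# Split the data frame into two separate data frames
# (the level is stored as lowercase "v4" in the data, so "V4" would match no rows)
Just.V4 <- LDA.scores[LDA.scores$Family == "v4", ]
Just.G8 <- LDA.scores[LDA.scores$Family == "G8", ]
library(adegenet)
# Keep only the 12 numeric predictor columns (columns 2 to 13)
x <- LDA.scores[2:13]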
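Because the split depends on the exact spelling of the factor levels, it can help to inspect them before subsetting. The following is only a minimal sketch on a small made-up data frame; `toy`, `by.family` and `wrong.case` are hypothetical names and the values are not the real data.

```r
# Hypothetical stand-in for LDA.scores: one factor column plus two numeric columns
toy <- data.frame(
  Family   = factor(c("v4", "v4", "G8", "G8", "G8")),
  Swimming = c(-0.48, 0.13, 0.39, 0.15, -0.05),
  Running  = c(-0.16, 0.06, 0.05, 0.02, 0.14)
)

# Inspect the exact level names before subsetting; the comparison is case-sensitive
print(levels(toy$Family))

# split() returns a named list with one data frame per level
by.family <- split(toy, toy$Family)

# A misspelled level silently yields zero rows rather than an error
wrong.case <- toy[toy$Family == "V4", ]
print(nrow(wrong.case))
```

Using `split()` avoids typing the level names at all, which sidesteps this kind of mismatch.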
Finding the number of clusters
grp<-find.clusters(x, max.n.clust=12, na.action="omit")
At this point the output is a graph of the cumulative variance explained by the PCA, with a prompt asking how many principal components (PCs) to retain based on the shape of the rising hockey-stick curve.
I chose to retain 2 PCs, as this is where the curve is straight before the elbow (Figure 2).
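The share of variance captured by each principal component can also be inspected outside adegenet before answering the prompt. The sketch below uses base R's `prcomp` on the built-in iris measurements as a stand-in for the 12 predictors. The data and the names `x.demo`, `pca` and `cum.var` are illustrative assumptions, and the values adegenet reports may differ depending on its centring and scaling settings.

```r
# Stand-in numeric data (built-in iris measurements), not the LDA.scores data
x.demo <- iris[, 1:4]

# PCA on centred and scaled variables
pca <- prcomp(x.demo, center = TRUE, scale. = TRUE)

# Eigenvalues are the squared standard deviations of the components
eig <- pca$sdev^2

# Cumulative percentage of variance explained by the first k components
cum.var <- 100 * cumsum(eig) / sum(eig)
print(round(cum.var, 1))

# Same style of plot as a PC-selection graph: one bar per retained PC
plot(cum.var, type = "h", lwd = 2, ylim = c(0, 100),
     xlab = "Number of retained PCs", ylab = "Cumulated variance (%)")
```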
Figure 2
The next step is to choose the actual number of clusters in the data set (see Figure 3), based on where the falling hockey-stick curve of BIC values reaches its elbow, which appears to be at 3 clusters.
Figure 3
The next step is to perform the discriminant analysis of principal components (DAPC)
dapc1<-dapc(x, grp$grp)
scatter(dapc1)
I have tried many different combinations, and here are some of the error and warning messages:
Error in dapc.data.frame(x, grp1$grp1) : Inconsistent length for grp
Warning in find.clusters.data.frame(as.data.frame(x), ...) :
NAs introduced by coercion
Error in if (n.pca >= N) warning("number of retained PCs of PCA is greater than N") :
missing value where TRUE/FALSE needed
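Two quick base R checks may help narrow these messages down. One plausible reading (an assumption, since the exact calls that triggered them are not shown) is that "Inconsistent length for grp" appears when the grouping vector does not have one entry per row of x, for instance because `grp1$grp1` is NULL, and that the coercion warning appears when a non-numeric column such as Family is passed along with the predictors. The objects `toy`, `x.ok` and `fake.result` below are hypothetical.

```r
# Hypothetical stand-in: a factor column followed by numeric predictors
toy <- data.frame(
  Family   = factor(c("v4", "v4", "G8", "G8")),
  Swimming = c(-0.48, 0.13, 0.39, 0.15),
  Running  = c(-0.16, 0.06, 0.05, 0.02)
)

# Check 1: every column handed to the analysis should be numeric
is.num <- sapply(toy, is.numeric)
print(is.num)
x.ok <- toy[, is.num, drop = FALSE]

# Converting labels such as "v4" to numbers gives NA (plus a coercion warning)
coerced <- suppressWarnings(as.numeric(as.character(toy$Family)))
print(coerced)

# Check 2: the grouping vector needs exactly one entry per row
fake.result <- list(grp = factor(c(1, 1, 2, 2)))  # mimics a list element named 'grp'
print(length(fake.result$grp) == nrow(x.ok))      # TRUE
print(is.null(fake.result$grp1))                  # TRUE: no element is called 'grp1'
```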
Solution
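set.seed(1234)  # k-means clustering uses random starts, so fix the seed
windows(width=10, height=7)  # Windows-only device; dev.new(width=10, height=7) is portable
x <- LDA.scores[, 2:13]  # the 12 numeric predictors, without the Family column
grp1 <- find.clusters(x, max.n.clust=12)
dapc1 <- dapc(x, grp1$grp)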
After the code started working, the next step was to choose the number of PCs to retain based on the variance explained by the PCA. I chose 2 PCs, which capture most of the variation in the data before the elbow of the curve.
Figure 4
Lastly, the final prompt asks for the number of linear discriminants to retain. I chose 1 because most of the variance in the data can be explained by the first discriminant.
Figure 5
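The interactive prompts can be skipped by passing the choices directly, which also makes the analysis easier to reproduce. This sketch relies on the `n.pca`, `n.clust` and `n.da` arguments of adegenet's `find.clusters()` and `dapc()` and uses the values chosen above (2 PCs, 3 clusters, 1 discriminant function). The built-in iris data stand in for the 12 predictors, and `x.demo`, `grp.demo` and `dapc.demo` are illustrative names.

```r
# Run only when adegenet is installed
if (requireNamespace("adegenet", quietly = TRUE)) {
  library(adegenet)
  set.seed(1234)

  # Stand-in numeric data; replace with LDA.scores[, 2:13] for the real analysis
  x.demo <- iris[, 1:4]

  # n.pca and n.clust answer the two interactive questions of find.clusters()
  grp.demo <- find.clusters(x.demo, max.n.clust = 12, n.pca = 2, n.clust = 3)

  # n.pca and n.da answer the two interactive questions of dapc()
  dapc.demo <- dapc(x.demo, grp.demo$grp, n.pca = 2, n.da = 1)

  # One group label per observation, and one retained discriminant axis
  print(table(grp.demo$grp))
  print(dim(dapc.demo$ind.coord))
}
```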
myCol <- c("red","purple","darkgreen")
scatter(dapc1,
        posi.da="bottomleft",
        bg="white",
        pch=17:19,
        col=myCol,
        inset.solid=0.5,
        lwd=9,
        lty=3,
        cex.lab=2,
        txt.leg=paste("Cluster", 1:3),
        legend=TRUE)
myInset <- function(){
  temp <- dapc1$pca.eig
  temp <- 100* cumsum(temp)/sum(temp)
  plot(temp, col=rep(c("black","lightgrey"),
                     c(dapc1$n.pca,1000)), ylim=c(0,100),
       xlab="PCA axis", ylab="Cumulated variance (%)",
       cex=1, pch=20, type="h", lwd=2)
}
add.scatter(myInset(), posi="bottomright",
            inset=c(-0.03,-0.01), ratio=.28,
            bg=transp("white"))
Figure 6
Density Plot
scatter(dapc1,1,1, col=myCol, bg="white",
scree.da=FALSE, legend=TRUE, solid=.4)
Figure 7
Data called LDA.scores
LDA.scores <- structure(list(Family = structure(c(2L, 2L, 2L, 2L, 2L, 2L, 2L,
2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L,
2L, 2L, 2L, 2L, 2L, 2L, 2L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L,
1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L,
1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L,
1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L), .Label = c("G8", "v4"), class = "factor"),
Swimming = c(-0.4805568, 0.12600625, 0.06823834, 0.67480139,
0.64591744, 0.21265812, -0.01841352, 0.12600625, -0.2206012,
0.27042603, 0.03935439, -0.45167284, -0.04729748, -0.10506539,
0.0971223, -0.07618143, 0.29930998, 0.01047043, -0.24948516,
-0.04729748, -0.01841352, -0.19171725, -0.4805568, 0.01047043,
-0.42278889, -0.45167284, -0.30725307, 0.24154207, 1.45466817,
-0.01841352, 0.38596185, 0.15489021, -0.04729748, 0.27042603,
-0.07618143, -0.10506539, -0.01841352, 0.01047043, 0.06823834,
-0.16283329, -0.01841352, -0.39390493, -0.04729748, 0.01047043,
0.01047043, 0.06823834, -0.04729748, -0.2206012, -0.16283329,
-0.07618143, -0.2206012, -0.19171725, -0.16283329, -0.2206012,
-0.13394934, -0.27836911, -0.04729748, 0.01047043, 0.12600625,
0.06823834, 0.06823834, 0.32819394, 0.32819394, -0.27836911,
0.18377416, 0.55926557, -0.19171725, -0.19171725, 0.01047043,
-0.19171725, -0.01841352, -0.07618143, -0.13394934, -0.39390493,
-0.04729748, -0.27836911, 0.70368535, 0.29930998, -0.13394934,
0.21265812), Not.Swimming = c(-0.0862927, -0.074481895, -0.056765686,
-0.050860283, -0.050860283, -0.068576492, -0.068576492, 0.05543697,
0.114491, -0.021333268, -0.04495488, 0.008193747, -0.056765686,
0.008193747, 0.037720761, 0.01409915, 0.108585597, -0.074481895,
0.002288344, 0.049531567, 0.043626164, 0.049531567, 0.020004552,
0.008193747, 0.025909955, 0.031815358, 0.049531567, -0.039049477,
-0.003617059, 0.002288344, 0.084963985, -0.080387298, 0.067247776,
0.031815358, 0.037720761, 0.025909955, 0.126301805, 0.031815358,
0.037720761, -0.050860283, -0.039049477, -0.003617059, 0.008193747,
-0.039049477, -0.003617059, 0.008193747, 0.01409915, -0.015427865,
0.020004552, 0.031815358, 0.020004552, -0.033144074, -0.039049477,
-0.009522462, -0.003617059, -0.04495488, -0.050860283, -0.04495488,
-0.068576492, -0.033144074, -0.027238671, -0.068576492, 0.01409915,
0.002288344, 0.025909955, -0.009522462, -0.009522462, 0.025909955,
0.15582882, 0.002288344, -0.04495488, -0.015427865, 0.008193747,
0.037720761, 0.008193747, -0.015427865, -0.056765686, 0.079058582,
-0.056765686, 0.025909955), Running = c(-0.157157188, 0.057316151,
0.064711783, 0.153459372, 0.072107416, 0.057316151, -0.053618335,
0.012942357, -0.03882707, 0.049920519, 0.012942357, -0.075805232,
0.035129254, -0.046222702, 0.109085578, -0.03882707, 0.057316151,
0.020337989, 0.035129254, 0.057316151, 0.005546724, -0.016640173,
-0.142365923, 0.220020063, -0.149761556, -0.134970291, 0.042524886,
0.072107416, 0.064711783, 0.020337989, 0.049920519, 0.020337989,
0.138668107, 0.049920519, 0.020337989, -0.083200864, -0.024035805,
-0.016640173, -0.03882707, -0.03882707, 0.005546724, -0.090596497,
-0.00924454, -0.016640173, -0.075805232, -0.090596497, 0.012942357,
-0.075805232, -0.061013967, -0.03882707, -0.112783394, -0.068409599,
-0.090596497, -0.053618335, -0.075805232, -0.090596497, 0.064711783,
0.012942357, 0.042524886, -0.061013967, -0.061013967, 0.064711783,
0.175646269, -0.068409599, 0.027733621, 0.042524886, -0.03882707,
-0.00924454, 0.027733621, -0.031431438, -0.046222702, -0.031431438,
-0.068409599, -0.120179026, 0.035129254, -0.061013967, 0.39751524,
0.138668107, 0.020337989, 0.035129254), Not.Running = c(-0.438809944,
-0.539013927, -0.539013927, -0.539013927, -0.472211271, -0.071395338,
-0.071395338, 0.296019267, 0.563229889, -0.03799401, 0.195815284,
-0.171599321, -0.305204632, 0.062209973, -0.104796666, 0.095611301,
0.028808645, -0.071395338, 0.329420595, 0.296019267, -0.171599321,
-0.071395338, 0.596631217, 0.062209973, 0.028808645, -0.138197994,
0.095611301, -0.104796666, 0.296019267, 0.028808645, -0.03799401,
-0.33860596, 0.129012629, 0.195815284, -0.03799401, 0.396223251,
0.362821923, -0.138197994, 0.26261794, -0.405408616, -0.205000649,
0.129012629, 0.195815284, -0.205000649, -0.004592683, -0.205000649,
-0.071395338, -0.171599321, -0.104796666, -0.138197994, -0.104796666,
-0.071395338, -0.104796666, -0.03799401, -0.004592683, -0.238401977,
0.028808645, -0.305204632, -0.305204632, -0.271803305, -0.03799401,
-0.372007288, 0.095611301, 0.195815284, 0.162413956, 0.229216612,
0.229216612, 0.396223251, 0.630032545, 0.463025906, 0.496427234,
0.062209973, -0.071395338, 0.229216612, -0.071395338, -0.071395338,
-0.205000649, 0.229216612, -0.305204632, 0.396223251), Fighting = c(-0.67708172,
-0.58224128, -0.11436177, -0.34830152, -0.84568695, -0.32933343,
0.35984044, -0.3251183, 1.51478626, 0.11114773, 0.27975296,
-0.89626852, 0.12379312, 0.66965255, 1.56536783, 0.56427428,
-0.71291033, -0.75927677, -0.75295407, -1.00164679, -1.03958296,
0.82139726, -1.07541157, -1.0311527, -0.98900139, -1.06908888,
-1.20186549, 0.58324237, -0.9700333, 0.22917139, 0.41042201,
-1.11545531, -0.19023412, 0.25446217, -0.05324237, 0.09007207,
1.21129685, 0.62539368, 1.32932051, 0.40199175, 0.44625062,
0.60221046, 0.33665722, -0.63493041, -0.282967, -0.32722587,
-0.11646933, -0.10171637, 0.13643851, -0.57802615, 0.05002833,
-0.1607282, -0.29139726, 0.13222338, -0.41152848, 0.68229794,
-0.24292325, -0.11646933, -0.21341734, -0.24292325, -0.24292325,
0.09007207, -0.34197883, -0.30825778, -0.08696342, -0.8119659,
0.49683219, -0.13754498, -0.4831857, 0.39988418, 0.90148474,
0.28396809, 1.05322945, 1.24923303, 0.47154141, 1.27873894,
0.05002833, 1.54218461, 0.74763247, 0.11747042), Not.Fighting = c(-0.097624192,
-0.160103675, -0.092996082, -0.234153433, -0.136963126, -0.15778962,
-0.15778962, -0.023574435, 0.00188017, -0.224897213, -0.109194467,
-0.069855533, -0.123078796, -0.111508522, -0.143905291, -0.099938247,
-0.118450687, 1.519900201, 0.177748344, 0.108326696, 0.652129604,
0.638245274, -0.072169588, 0.087500202, -0.18093017, -0.146219346,
-0.049029039, -0.125392851, -0.134649071, -0.060599313, -0.086053918,
-0.197128554, -0.083739863, -0.092996082, 0.844196163, 0.055103433,
1.971140911, -0.111508522, -0.224897213, -0.187872334, -0.160103675,
-0.194814499, -0.053657149, -0.206384774, 0.108326696, -0.164731785,
0.187004564, 0.025020719, 0.057417488, 0.434608441, 0.057417488,
0.073615872, -0.035144709, -0.051343094, -0.134649071, -0.185558279,
0.013450444, -0.134649071, -0.215640993, -0.185558279, -0.005061995,
-0.238781543, -0.099938247, -0.16704584, -0.208698829, 0.048161268,
0.048161268, -0.037458764, 0.16154996, 0.031962884, -0.102252302,
-0.123078796, -0.139277181, -0.208698829, -0.118450687, -0.072169588,
-0.044400929, -0.030516599, -0.132335016, -0.037458764),
Resting = c(0.01081204879, -0.03398160805, 0.057108797, -0.04063432116,
-0.13084281035, -0.02997847693, 0.12732080268, -0.1028170581,
0.08155320398, -0.17932134171, -0.14338902206, -0.02058415581,
-0.11528274705, -0.11764091337, 0.04389156236, 0.01399844913,
-0.05755560242, 0.04711630687, 0.0158428036, 0.093485909,
0.09677967302, 0.02053612974, -0.03608286844, 0.07805238146,
-9.686695e-05, -0.02285413055, -0.00424187149, 0.01446241356,
0.03187450017, 0.11323315542, -0.01171898422, -0.06499053655,
-0.07758659568, -0.07399758157, -0.11503350996, 0.02167111711,
0.01904454162, 0.05768779393, 0.05555202379, -0.01031175326,
-0.00458313459, 0.17430774591, 0.00481502094, -0.00928412956,
0.09047589183, 0.08917985896, -0.05671203072, -0.05333390954,
0.08541446168, 0.10140397965, -0.02509342995, -0.0369877908,
0.04609635201, 0.06524159499, 0.0845977309, -0.03239032508,
-0.03208740616, 0.06264952925, 0.05241547086, -0.03437271856,
-0.03437271856, -0.06747523863, -0.01270059491, 0.10014629095,
-0.02872845706, -0.00950652573, 0.04867308008, 0.02486518629,
-0.05951115497, -0.02353665674, -0.01967923345, -0.10148651548,
-0.00480936518, -0.00098261723, -0.13970798195, -0.00286148145,
-0.05492902692, 0.10732815358, 0.11660744219, -0.02016620439
), Not.Resting = c(-0.77046287, 0.773856776, -2.593072768,
-2.837675606, -1.680828329, -0.947623773, -0.947623773, -2.607366431,
-0.637055341, -1.818396455, 2.170944974, -0.658126752, -0.808243774,
2.377766908, 2.111220276, -0.322326312, 2.218858946, 3.920878638,
-0.304945754, 1.038591535, 1.752268128, 0.907465624, 1.137774798,
-3.663486997, 2.350924346, 0.067293462, -1.898454393, -2.497647463,
-4.471716512, -1.465081244, -0.232806371, -3.043893581, -2.323908986,
1.437404886, 1.079056696, 1.110865131, 1.404724068, -1.706664294,
0.736746935, -0.005516985, 1.727170333, 1.685228831, 1.836016918,
0.46617392, 1.697173771, 1.057314221, 0.933704227, 0.482480775,
0.680713089, 0.090780703, 0.680713089, -0.982921741, -2.281900378,
0.97208909, 0.027767791, -0.1628815, -0.530221948, -0.385741863,
-0.972251823, 0.002267358, -1.134447998, 0.626424009, -0.722750217,
-0.382722075, -0.356550578, -1.851614124, -1.851614124, 1.731465143,
0.254319006, 2.043778341, -0.28991392, 1.386940871, 0.054207713,
0.594212936, 1.551821303, 3.100704184, 0.327263666, -1.055195336,
-1.134447998, 1.730726972), Hunting = c(-0.67708172, -0.58224128,
-0.11436177, -0.34830152, -0.84568695, -0.32933343, 0.35984044,
-0.3251183, 1.51478626, 0.11114773, 0.27975296, -0.89626852,
0.12379312, 0.66965255, 1.56536783, 0.56427428, -0.71291033,
-0.75927677, -0.75295407, -1.00164679, -1.03958296, 0.82139726,
-1.07541157, -1.0311527, -0.98900139, -1.06908888, -1.20186549,
0.58324237, -0.9700333, 0.22917139, 0.41042201, -1.11545531,
-0.19023412, 0.25446217, -0.05324237, 0.09007207, 1.21129685,
0.62539368, 1.32932051, 0.40199175, 0.44625062, 0.60221046,
0.33665722, -0.63493041, -0.282967, -0.32722587, -0.11646933,
-0.10171637, 0.13643851, -0.57802615, 0.05002833, -0.1607282,
-0.29139726, 0.13222338, -0.41152848, 0.68229794, -0.24292325,
-0.11646933, -0.21341734, -0.24292325, -0.24292325, 0.09007207,
-0.34197883, -0.30825778, -0.08696342, -0.8119659, 0.49683219,
-0.13754498, -0.4831857, 0.39988418, 0.90148474, 0.28396809,
1.05322945, 1.24923303, 0.47154141, 1.27873894, 0.05002833,
1.54218461, 0.74763247, 0.11747042), Not.Hunting = c(-0.097624192,
-0.160103675, -0.092996082, -0.234153433, -0.136963126, -0.15778962,
-0.15778962, -0.023574435, 0.00188017, -0.224897213, -0.109194467,
-0.069855533, -0.123078796, -0.111508522, -0.143905291, -0.099938247,
-0.118450687, 1.519900201, 0.177748344, 0.108326696, 0.652129604,
0.638245274, -0.072169588, 0.087500202, -0.18093017, -0.146219346,
-0.049029039, -0.125392851, -0.134649071, -0.060599313, -0.086053918,
-0.197128554, -0.083739863, -0.092996082, 0.844196163, 0.055103433,
1.971140911, -0.111508522, -0.224897213, -0.187872334, -0.160103675,
-0.194814499, -0.053657149, -0.206384774, 0.108326696, -0.164731785,
0.187004564, 0.025020719, 0.057417488, 0.434608441, 0.057417488,
0.073615872, -0.035144709, -0.051343094, -0.134649071, -0.185558279,
0.013450444, -0.134649071, -0.215640993, -0.185558279, -0.005061995,
-0.238781543, -0.099938247, -0.16704584, -0.208698829, 0.048161268,
0.048161268, -0.037458764, 0.16154996, 0.031962884, -0.102252302,
-0.123078796, -0.139277181, -0.208698829, -0.118450687, -0.072169588,
-0.044400929, -0.030516599, -0.132335016, -0.037458764),
Grooming = c(0.01081204879, -0.03398160805, 0.057108797,
-0.04063432116, -0.13084281035, -0.02997847693, 0.12732080268,
-0.1028170581, 0.08155320398, -0.17932134171, -0.14338902206,
-0.02058415581, -0.11528274705, -0.11764091337, 0.04389156236,
0.01399844913, -0.05755560242, 0.04711630687, 0.0158428036,
0.093485909, 0.09677967302, 0.02053612974, -0.03608286844,
0.07805238146, -9.686695e-05, -0.02285413055, -0.00424187149,
0.01446241356, 0.03187450017, 0.11323315542, -0.01171898422,
-0.06499053655, -0.07758659568, -0.07399758157, -0.11503350996,
0.02167111711, 0.01904454162, 0.05768779393, 0.05555202379,
-0.01031175326, -0.00458313459, 0.17430774591, 0.00481502094,
-0.00928412956, 0.09047589183, 0.08917985896, -0.05671203072,
-0.05333390954, 0.08541446168, 0.10140397965, -0.02509342995,
-0.0369877908, 0.04609635201, 0.06524159499, 0.0845977309,
-0.03239032508, -0.03208740616, 0.06264952925, 0.05241547086,
-0.03437271856, -0.03437271856, -0.06747523863, -0.01270059491,
0.10014629095, -0.02872845706, -0.00950652573, 0.04867308008,
0.02486518629, -0.05951115497, -0.02353665674, -0.01967923345,
-0.10148651548, -0.00480936518, -0.00098261723, -0.13970798195,
-0.00286148145, -0.05492902692, 0.10732815358, 0.11660744219,
-0.02016620439), Not.Grooming = c(-0.77046287, 0.773856776,
-2.593072768, -2.837675606, -1.680828329, -0.947623773, -0.947623773,
-2.607366431, -0.637055341, -1.818396455, 2.170944974, -0.658126752,
-0.808243774, 2.377766908, 2.111220276, -0.322326312, 2.218858946,
3.920878638, -0.304945754, 1.038591535, 1.752268128, 0.907465624,
1.137774798, -3.663486997, 2.350924346, 0.067293462, -1.898454393,
-2.497647463, -4.471716512, -1.465081244, -0.232806371, -3.043893581,
-2.323908986, 1.437404886, 1.079056696, 1.110865131, 1.404724068,
-1.706664294, 0.736746935, -0.005516985, 1.727170333, 1.685228831,
1.836016918, 0.46617392, 1.697173771, 1.057314221, 0.933704227,
0.482480775, 0.680713089, 0.090780703, 0.680713089, -0.982921741,
-2.281900378, 0.97208909, 0.027767791, -0.1628815, -0.530221948,
-0.385741863, -0.972251823, 0.002267358, -1.134447998, 0.626424009,
-0.722750217, -0.382722075, -0.356550578, -1.851614124, -1.851614124,
1.731465143, 0.254319006, 2.043778341, -0.28991392, 1.386940871,
0.054207713, 0.594212936, 1.551821303, 3.100704184, 0.327263666,
-1.055195336, -1.134447998, 1.730726972), Other = c(0.019502286,
-0.290451956, 0.359948884, 0.557840914, 0.117453376, 0.126645924,
0.126645924, 0.196486873, 0.152780228, 0.354469789, -0.261430968,
0.176448238, -0.007374708, -0.557848621, -0.213674557, -0.005819262,
-0.470070992, -0.786078864, 0.006063789, -0.27184265, -0.349418792,
-0.338096262, -0.165119403, 0.346566439, -0.344191931, 0.074321265,
0.179825379, 0.278407054, 0.593125727, 0.199177375, -0.058900625,
0.633875622, 0.428150308, -0.206023441, -0.436958199, -0.291839246,
-0.907641911, 0.448567295, -0.127186127, 0.024715134, -0.41634503,
-0.330697382, -0.469720666, -0.047494017, -0.301732446, -0.138901021,
0.098101379, -0.002063769, -0.02832419, 0.071630763, -0.02832419,
0.295110588, 0.347112947, -0.083577573, -0.036886152, 0.189045953,
0.467596992, 0.303378276, 0.218879697, 0.092005711, 0.27011134,
-0.012909856, 0.262292068, 0.107125772, 0.123422927, 0.299426602,
0.299426602, -0.326871824, -0.022088391, -0.428508341, -0.014675497,
-0.114462294, 0.087227267, -0.031519161, -0.159318008, -0.397875854,
0.101520559, 0.244481505, 0.529968994, -0.32661959)), .Names = c("Family",
"Swimming", "Not.Swimming", "Running", "Not.Running", "Fighting",
"Not.Fighting", "Resting", "Not.Resting", "Hunting", "Not.Hunting",
"Grooming", "Not.Grooming", "Other"), class = "data.frame", row.names = c(NA,
-80L))