I am trying to implement kernel k-means clustering with the kkmeans() function from the kernlab R package. My code returns the expected output for some numbers of clusters passed to the function's centers argument, but throws an error for others:
Error in if (sum(abs(dc)) < 1e-15) break : missing value where TRUE/FALSE needed
My guess is that this is a convergence issue, since the error appears when I increase the number of clusters. That would be surprising, though, because I have many more rows than clusters. With an 8000x3 matrix, 10 clusters works but 100 clusters fails. Similarly, with a 50-row subset of that data, 5 clusters works but 10 fails.
Below is a minimal reproducible example that shows both the failing and the successful call. The mymat data is defined at the end.
Error with centers = 10:
kernlab::kkmeans(mymat, centers=10)
#> Using automatic sigma estimation (sigest) for RBF or laplace kernel
#> Error in if (sum(abs(dc)) < 1e-15) break: missing value where TRUE/FALSE needed
No error with centers = 5:
kernlab::kkmeans(mymat, centers=5)
#> Using automatic sigma estimation (sigest) for RBF or laplace kernel
#> Spectral Clustering object of class "specc"
#>
#> Cluster memberships:
#>
#> 1 1 1 1 2 1 1 3 3 5 5 5 3 2 2 2 4 4 3 3 5 2 2 5 5 5 5 5 5 2 4 3 3 3 2 2 5 3 3 5 5 4 4 4 3 1 4 2 5 3
#>
#> Gaussian Radial Basis kernel function.
#> Hyperparameter : sigma = 0.756590498067127
#>
#> Centers:
#>          [,1]      [,2]     [,3]
#> [1,] 15.75871 -16.69486 191.5841
#> [2,] 16.74850 -21.94730 186.8914
#> [3,] 15.99483 -18.95892 190.2622
#> [4,] 15.45729 -18.13571 191.9611
#> [5,] 16.69136 -22.19600 187.0055
#>
#> Cluster size:
#> [1] 7 10 12 7 14
#>
#> Within-cluster sum of squares:
#> [1] 301006.7 443237.8 607889.4 305777.1 685823.5
Example data (50x3 matrix):
mymat <- structure(c(15.9390001296997, 15.9079999923706, 16.087999343872,
15.7930002212524, 15.9619998931884, 15.6129999160766, 15.7550001144409,
16.7740001678466, 16.9080009460449, 17.0769996643066, 16.3640003204345,
16.5960006713867, 16.579999923706, 16.4570007324218, 16.2320003509521,
16.1639995574951, 15.6180000305175, 15.5109996795654, 15.5120000839233,
15.628999710083, 16.9950008392333, 17.3530006408691, 17.2229995727539,
16.8910007476806, 17.1800003051757, 17.1709995269775, 16.9860000610351,
16.704999923706, 16.273000717163, 15.8830003738403, 15.6230001449584,
15.333999633789, 15.3839998245239, 15.3870000839233, 17.1119995117187,
17.6200008392333, 16.8349990844726, 16.4969997406005, 16.2479991912841,
16.1259994506835, 15.8059997558593, 15.378999710083, 15.4320001602172,
15.2100000381469, 15.2519998550415, 15.2150001525878, 15.4280004501342,
17.4790000915527, 16.6739997863769, 16.4330005645751, -16.6299991607666,
-16.9529991149902, -17.5610008239746, -17.8290004730224, -18.6200008392333,
-17.1079998016357, -16.25, -21.716999053955, -21.1219997406005,
-21.8209991455078, -20.1840000152587, -20.0450000762939, -20.9599990844726,
-19.5240001678466, -18.6590003967285, -19.4379997253417, -18.6280002593994,
-18.0669994354248, -16.204999923706, -15.5830001831054, -23.9489994049072,
-23.57200050354, -24.3969993591308, -23.2880001068115, -22.6019992828369,
-23.2329998016357, -22.5979995727539, -22.6140003204345, -20.8059997558593,
-19.4300003051757, -19.4729995727539, -17.5690002441406, -16.8110008239746,
-15.2930002212524, -25.2509994506835, -24.7649993896484, -24.8080005645751,
-21.9939994812011, -21.5189990997314, -20.329999923706, -20.25,
-19.1380004882812, -18.6180000305175, -18.5900001525878, -16.1620006561279,
-14.5329999923706, -14.4359998703002, -25.8169994354248, -24.2159996032714,
-22.57200050354, 190.996994018554, 190.996002197265, 190.18699645996,
191.039993286132, 190.205993652343, 191.919006347656, 191.766006469726,
187.14599609375, 186.889007568359, 186.225997924804, 188.60400390625,
187.932006835937, 187.837005615234, 188.453002929687, 189.382995605468,
189.360000610351, 191.25, 191.845001220703, 192.580001831054,
192.414993286132, 185.358001708984, 184.570999145507, 184.595993041992,
186.091995239257, 185.613998413085, 185.25, 186.235000610351,
187.003005981445, 188.744995117187, 190.169998168945, 190.921005249023,
192.628997802734, 192.768005371093, 193.281997680664, 184.602996826171,
183.796005249023, 185.414001464843, 187.811004638671, 188.615005493164,
189.263000488281, 190.167007446289, 191.781997680664, 191.837997436523,
192.582000732421, 193.399002075195, 194.184005737304, 193.509994506835,
183.776000976562, 186.173995971679, 187.774993896484), dim = c(50L,
3L), dimnames = list(NULL, c("x", "y", "z")))
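To narrow down which cluster counts fail, the calls could be wrapped in tryCatch() and tabulated. This is only a rough diagnostic sketch, not a fix: probe_centers() and its fit argument are hypothetical names introduced here for illustration, and the only kernlab call assumed is kkmeans(x, centers = k) as used above.

```r
# Hypothetical helper (not part of kernlab): call a clustering function
# once per cluster count and record "ok" or the error message.
# `fit` defaults to kernlab::kkmeans; it is a parameter only so that the
# helper can be exercised with any function taking (x, centers = k).
probe_centers <- function(x, ks, fit = kernlab::kkmeans) {
  vapply(ks, function(k) {
    tryCatch({
      fit(x, centers = k)   # result is discarded; only success matters
      "ok"
    }, error = function(e) conditionMessage(e))
  }, character(1))
}
```

With the data above, this would be called as probe_centers(mymat, c(5, 10)). If the initial centers are chosen randomly, the outcome for a given count may differ between runs, so repeating the call a few times may be informative.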