Here is one possible approach, in which I treat the problem directly as an optimisation problem.
Suppose you have a feasible partition of the rows into groups: not necessarily a good one, but one that does not violate the constraints. For every group (cluster), you compute the centre. Then you compute the distances of all points in a group to the group's centre, and sum these distances over all groups. This gives you a measure of the quality of your initial partition.
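(In symbols: if $G_1, \dots, G_k$ are the groups and $c_g$ is the centre of group $g$, this quality measure is $\sum_{g=1}^{k} \sum_{i \in G_g} \lVert x_i - c_g \rVert$, i.e. the total distance of all points to their group centres, which we want to minimise.)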
Now randomly pick one row and move it into another group. You get a new solution. Compute its quality as before, and compare it with the previous solution's quality. If it is better, keep it; if it is worse, stay with the old solution. Repeat this whole procedure for a fixed number of iterations.
This process is called a Local Search. Of course, it is not guaranteed to take you to an optimal solution, but it will likely give you a good one. (Note that k-means implementations are typically stochastic as well, and there is no guarantee of an "optimal" partition.)
The good thing about a Local Search is that it gives you a lot of flexibility. For instance, I assumed you started with a feasible solution. Suppose you make a random move (i.e. move one row into another cluster), but now this new cluster is too big. You could simply discard this new, infeasible solution and draw a new one.
Here is a code example; it is really just an outline, but I hope it is useful for you.
set.seed(123)
id <- 1:600
lon <- rnorm(600, 88.5, 0.125)
lat <- rnorm(600, 22.4, 0.15)
demand <- round(rnorm(600, 40, 20))
df <- data.frame(id, lon, lat, demand)
Fix a number of clusters, k.
k <- 5
Start with kmeans and plot the solution.
par(mfrow = c(1, 3))
km <- kmeans(cbind(df$lat, df$lon), centers = k)
cols <- hcl.colors(n = k, "Cold")
plot(df$lon,
     df$lat,
     type = "p", pch = 19, cex = 0.5,
     main = "kmeans")
for (i in seq_len(k)) {
    points(df$lon[km$cluster == i],
           df$lat[km$cluster == i],
           pch = 19,
           col = cols[i])
}
Now a Local Search. I use an implementation in package NMOF
(which I maintain).
library("NMOF")
## a random initial solution
x0 <- sample(1:k, length(id), replace = TRUE)
X <- as.matrix(df[, 2:3])
The objective function: it takes a partition x and computes the sum of distances over all clusters.
sum_diff <- function(x, X, k, ...) {
    groups <- seq_len(k)
    d_centre <- numeric(k)
    for (g in groups) {
        ## centre of group g ('drop = FALSE' keeps the
        ## matrix structure for single-member groups)
        centre <- colMeans(X[x == g, , drop = FALSE])
        ## distances of the group's points to its centre
        d <- t(X[x == g, , drop = FALSE]) - centre
        d_centre[g] <- sum(sqrt(colSums(d * d)))
    }
    sum(d_centre)
}
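As a quick check, you can evaluate the random initial solution directly:

sum_diff(x0, X, k)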
The neighbourhood function: it takes a partition and moves
one row into another cluster.
nb <- function(x, k, ...) {
    groups <- seq_len(k)
    x_new <- x
    ## pick one row at random ...
    p <- sample.int(length(x), 1)
    ## ... and move it into another, randomly-chosen group
    g_ <- groups[-x_new[p]]
    x_new[p] <- g_[sample.int(length(g_), 1)]
    x_new
}
Run the Local Search. I actually use a method called Threshold Accepting, which is based on Local Search but can also move away from local minima. See ?NMOF::TAopt for references on the method.
sol <- TAopt(sum_diff,
             list(x0 = x0,
                  nI = 20000,
                  neighbour = nb),
             X = X,
             k = k)
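TAopt returns a list whose components include the best solution found (xbest) and its objective-function value (OFvalue), so you can compare with the random start:

sum_diff(x0, X, k)  ## quality of the random start
sol$OFvalue         ## quality after the search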
Plot the solution.
plot(df$lon,
     df$lat,
     type = "p", pch = 19, cex = 0.5,
     main = "Local search")
for (i in seq_len(k)) {
    points(df$lon[sol$xbest == i],
           df$lat[sol$xbest == i],
           pch = 19,
           col = cols[i])
}
Now, here is one way to add a capacity constraint. We start with a feasible solution (with the seed used above, the random initial solution x0 already is one):
## CAPACITY-CONSTRAINED
max.demand <- 6600
all(tapply(df$demand, x0, sum) < max.demand)
## TRUE
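If the random start were not feasible, one simple way to get a feasible one, just as a sketch, would be to resample until the check passes:

## resample until the initial solution is feasible
while (!all(tapply(df$demand, x0, sum) < max.demand))
    x0 <- sample(1:k, length(id), replace = TRUE)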
The constraint is handled in the neighbourhood. If the new solution exceeds the capacity, it is discarded.
nb_constr <- function(x, k, demand, max.demand, ...) {
    groups <- seq_len(k)
    x_new <- x
    p <- sample.int(length(x), 1)
    g_ <- groups[-x_new[p]]
    x_new[p] <- g_[sample.int(length(g_), 1)]
    ## if the capacity of the receiving cluster is
    ## exceeded, return the original solution
    if (sum(demand[x_new == x_new[p]]) > max.demand)
        x
    else
        x_new
}
Run the method and compare the results.
sol <- TAopt(sum_diff,
             list(x0 = x0,
                  nI = 20000,
                  neighbour = nb_constr),
             X = X,
             k = k,
             demand = df$demand,
             max.demand = max.demand)
plot(df$lon,
     df$lat,
     type = "p", pch = 19, cex = 0.5,
     main = "Local search w/ constraint")
for (i in seq_len(k)) {
    points(df$lon[sol$xbest == i],
           df$lat[sol$xbest == i],
           pch = 19,
           col = cols[i])
}
all(tapply(df$demand, sol$xbest, sum) < max.demand)
## TRUE
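You can also inspect the per-cluster demands directly:

tapply(df$demand, sol$xbest, sum)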

This is really just an example and could be improved in many ways. For instance, the objective function recomputes the distances for all groups after every move, when it would only need to recompute the two groups affected by the move.
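As a sketch of such a delta evaluation (not used above; update_diff, the cached vector d_centre and the argument affected are hypothetical names): keep the per-cluster sums, and after a move recompute only the changed groups.

## sketch of a cheaper, incremental evaluation:
## 'd_centre' caches the per-cluster sums; 'affected'
## holds the labels of the one or two changed groups
update_diff <- function(d_centre, x, X, affected) {
    for (g in affected) {
        Xg <- X[x == g, , drop = FALSE]
        centre <- colMeans(Xg)
        d <- t(Xg) - centre
        d_centre[g] <- sum(sqrt(colSums(d * d)))
    }
    d_centre  ## the objective is sum(d_centre)
}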