Consider the sorted array a
:
a = np.array([0, 2, 3, 4, 5, 10, 11, 11, 14, 19, 20, 20])
If I specified left and right deltas,
delta_left, delta_right = 1, 1
Then this is how I'd expect the clusters to be assigned:
# a = [ 0 . 2 3 4 5 . . . . 10 11 . . 14 . . . . 19 20
# 11 20
#
# [10--|-12] [19--|-21]
# [1--|--3] [10--|-12] [19--|-21]
# [-1--|--1] [3--|--5] [9--|-11] [18--|-20]
# +--+--|--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--|
# [2--|--4] [13--|-15]
#
# │ ╰──┬───╯ ╰┬─╯ │ ╰┬─╯
# │ cluster 2 Cluster 3 │ Cluster 5
# Cluster 1 Cluster 4
NOTE: Despite the interval [-1, 1]
sharing an edge with [1, 3]
, neither interval includes an adjacent point and therefore do not constitute joining their respective clusters.
Assuming the cluster assignments were stored in an array named clusters
, I'd expect the results to look like this
print(clusters)
[1 2 2 2 2 3 3 3 4 5 5 5]
However, suppose I change the left and right deltas to be different:
delta_left, delta_right = 2, 1
This means that for a value of x
it should be combined with any other point in the interval [x - 2, x + 1]
# a = [ 0 . 2 3 4 5 . . . . 10 11 . . 14 . . . . 19 20
# 11 20
#
# [9-----|-12] [18-----|-21]
# [0-----|--3] [9-----|-12] [18-----|-21]
# [-2-----|--1][2-----|--5] [8-----|-11] [17-----|-20]
# +--+--|--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--|
# [1 ----|--4] [12-----|-15]
#
# ╰─────┬─────╯ ╰┬─╯ │ ╰┬─╯
# cluster 1 Cluster 2 │ Cluster 4
# Cluster 3
NOTE: Despite the interval [9, 12]
sharing an edge with [12, 15]
, neither interval includes an adjacent point and therefore do not constitute joining their respective clusters.
Assuming the cluster assignments were stored in an array named clusters
, I'd expect the results to look like this:
print(clusters)
[1 1 1 1 1 2 2 2 3 4 4 4]