I have a 1-dimensional NumPy array (arr0) containing different values. I want to build a new array in which each element is a pair (indexes and/or values) made of one element and its closest element, provided that the absolute difference (distance) between the two is below a set threshold.

At each step (coupling) I would like to remove the elements already coupled.

arr0 = [40, 55, 190, 80, 175, 187] #My original 1D array
threshold = 20 #Returns elements if "abs(el_1 - el_2)<threshold"
#For each couple found, the code should remove the couple from the array and then go on with the next couple
result_indexes = [[0, 1], [2, 5]]
result_value = [[40, 55], [190, 187]]
Sav

2 Answers


You could do something like this, using sklearn.metrics.pairwise_distances to compute all pairwise distances:

import numpy as np
from sklearn.metrics import pairwise_distances

# Get all pairwise distances
distances = pairwise_distances(np.array(arr0).reshape(-1,1),metric='l1')
# Sort the neighbors by distance for each element 
neighbors_matrix = np.argsort(distances,axis=1)

result_indexes = []
result_values = []

used_indexes = set()

for i, neighbors in enumerate(neighbors_matrix):

    # Skip already used indexes
    if i in used_indexes:
        continue

    # Remaining neighbors
    remaining = [ n for n in neighbors if n not in used_indexes and n != i]
    # The closest unused neighbor, if there is one, is remaining[0]
    if len(remaining) == 0:
        continue

    if distances[i,remaining[0]] < threshold:
        result_indexes.append((i,remaining[0]))
        result_values.append((arr0[i],arr0[remaining[0]]))

        used_indexes = used_indexes.union({i,remaining[0]})

On your example, it yields:

>> result_indexes
[(0, 1), (2, 5)]
>> result_values
[(40, 55), (190, 187)]
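If you would rather not depend on scikit-learn, the same greedy idea can be sketched with NumPy broadcasting alone. This is only an illustration under the assumptions stated in the question (elements are visited in their original order, each one takes its closest still-unused neighbor, and the pair is kept only if the distance is below the threshold). The function name pair_closest is hypothetical, not part of any library.

```python
import numpy as np

def pair_closest(values, threshold):
    """Greedily pair each unused element with its closest unused element.

    Hypothetical helper: returns a list of (i, j) index pairs.
    """
    a = np.asarray(values, dtype=float)
    # Absolute difference between every pair of elements (broadcasting)
    dist = np.abs(a[:, None] - a[None, :])
    # An element must never be paired with itself
    np.fill_diagonal(dist, np.inf)

    free = np.ones(len(a), dtype=bool)  # True while an element is unpaired
    pairs = []
    for i in range(len(a)):
        if not free[i]:
            continue
        # Distances from i to the elements that are still free
        row = np.where(free, dist[i], np.inf)
        j = int(np.argmin(row))
        if row[j] < threshold:
            pairs.append((i, j))
            free[[i, j]] = False  # both elements are now consumed
    return pairs

arr0 = [40, 55, 190, 80, 175, 187]
pairs = pair_closest(arr0, 20)
values = [(arr0[i], arr0[j]) for i, j in pairs]
```

Like the version above, this builds the full distance matrix, so memory grows quadratically with the length of the array.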
Nakor
  • This works for coupling. However, it does not prevent the same elements from being coupled again if other similar elements exist. – Sav Jul 15 '19 at 03:58
  • I edited my question as well because it probably was not clear. The function does not work properly: it couples the first element with the first one whose distance is below the threshold of 20. However, I need to find the closest element, couple it, and then ensure that neither of them will be coupled again. – Sav Jul 15 '19 at 04:24
  • Oh my bad, I misunderstood. So the order in which you're considering the elements in your array will change the result, correct? – Nakor Jul 15 '19 at 04:27
  • Yes, correct. I also tried to use KDtree to calculate the "distances" but it didn't work. – Sav Jul 15 '19 at 04:35
  • I updated it, I hope this time I understood correctly ^^ – Nakor Jul 15 '19 at 04:46
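
arr0s = sorted(arr0)
n = len(arr0s)
z = []
x = 0
# Walk the sorted values and compare each element with its right-hand neighbors
while x < n - 1:
    gap = arr0s[x+1] - arr0s[x]
    if gap < threshold:
        # Pair x with x+1 unless x+1 is at least as close to x+2
        # (x+2 may not exist when x+1 is the last element)
        if x + 2 >= n or gap < arr0s[x+2] - arr0s[x+1]:
            z.append([arr0s[x], arr0s[x+1]])
            x += 2
        else:
            z.append([arr0s[x+1], arr0s[x+2]])
            x += 3
    else:
        x += 1

# Map the paired values back to their positions in the original list.
# list.index returns the first match, so this assumes the values are unique.
result_indexes = [sorted([arr0.index(a), arr0.index(b)]) for a, b in z]
result_value = [[arr0[i], arr0[j]] for i, j in result_indexes]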
print(result_indexes)
#[[0, 1], [2, 5]]
print(result_value)
#[[40, 55], [190, 187]]
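One caveat with mapping values back through arr0.index is that it always returns the first match, so repeated values end up with the same index. A possible variant, sketched here with np.argsort so that every sorted element carries its original position, avoids that lookup. The function name pair_sorted_neighbours is hypothetical, and the sketch also checks the final pair of the sorted array, which is an assumption about the intended behaviour.

```python
import numpy as np

def pair_sorted_neighbours(values, threshold):
    """Pair adjacent elements of the sorted array; return original index pairs.

    Hypothetical helper sketching the sorted-neighbor approach.
    """
    a = np.asarray(values)
    order = np.argsort(a, kind="stable")  # original index of each sorted element
    s = a[order]
    pairs = []
    x = 0
    while x < len(s) - 1:
        gap = s[x + 1] - s[x]
        if gap >= threshold:
            x += 1
            continue
        # If x+1 is at least as close to x+2, skip x and pair (x+1, x+2) instead
        if x + 2 < len(s) and s[x + 2] - s[x + 1] <= gap:
            x += 1
        i, j = sorted((int(order[x]), int(order[x + 1])))
        pairs.append((i, j))
        x += 2
    return pairs

arr0 = [40, 55, 190, 80, 175, 187]
result_indexes = pair_sorted_neighbours(arr0, 20)
result_value = [(arr0[i], arr0[j]) for i, j in result_indexes]
```

Because only neighbors in sorted order are compared, this runs in O(n log n) and never builds a full distance matrix. Note that it resolves conflicts by sorted position rather than by the original order of the elements, so it is not guaranteed to match a strictly order-dependent greedy pairing on every input.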
ComplicatedPhenomenon