I have a (big) boolean array, and I'm looking for a way to fill in True wherever a False separates two runs of True that each have at least a given minimum length.
For example:

import numpy as np

a = np.array([True] * 3 + [False] + [True] * 4 + [False] * 2 + [True] * 2)
# a == array([ True,  True,  True, False,  True,  True,  True,  True, False, False,  True,  True])

closed_a = close(a, min_merge_size=2)
# closed_a == array([ True,  True,  True, True,  True,  True,  True,  True, False, False,  True,  True])

Here the False value at index [3] is converted to True because it has a run of at least 2 True elements on both sides. Conversely, elements [8] and [9] remain False because they don't have such a run on both sides.

I tried using scipy.ndimage.binary_closing with structure=[True, True, False, True, True] (and with False in the middle) but it doesn't give me what I need.
Any ideas?

Jon Nir

1 Answer

This one was tough, but I came up with something using itertools and more_itertools (a third-party package). Similar to what you tried, the idea is to slide a window of length 2 * n + 1 over the array and check directly whether each window equals the indicator sequence n * True, False, n * True.

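from itertools import chain

import numpy as np
from more_itertools import windowed


def join_true_runs(seq, min_length_true=2):
    n = min_length_true
    # Pattern to look for: n * True, a single False, n * True
    sentinel = tuple(chain([True] * n, [False], [True] * n))

    # i is the start of the window, so i + n is the index of the False in the middle
    indices = [
        i + n for i, w in enumerate(windowed(seq, 2 * n + 1)) if w == sentinel
    ]
    seq = seq.copy()  # optional: without this line the input array is modified
    seq[indices] = True
    return seq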
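
If more_itertools is not available, or the Python-level loop is too slow for a big array, the same window check can be sketched with NumPy alone. This is only a sketch under a few assumptions: it needs NumPy 1.20 or later for sliding_window_view, the name join_true_runs_np is made up here, and like the function above it only fills gaps consisting of a single False.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view


def join_true_runs_np(arr, min_length_true=2):
    """Hypothetical NumPy-only variant: returns a copy with matching gaps filled."""
    arr = np.asarray(arr, dtype=bool)
    n = min_length_true
    width = 2 * n + 1
    out = arr.copy()
    if arr.size < width:
        return out  # too short to contain the pattern at all

    # Pattern to look for: n * True, a single False, n * True
    pattern = np.array([True] * n + [False] + [True] * n)

    # Read-only view of shape (len(arr) - width + 1, width); no data is copied here
    windows = sliding_window_view(arr, width)

    # Start indices of the windows that equal the pattern exactly
    starts = np.flatnonzero((windows == pattern).all(axis=1))

    out[starts + n] = True  # start + n is the False in the middle of the window
    return out
```

Note that the comparison windows == pattern builds a temporary boolean array with len(arr) * (2 * n + 1) elements, so for very large arrays and large n the memory use is worth keeping an eye on.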

You should probably write some tests to check for corner cases, though it does seem to work on this test array:

arr = np.array([True, True, True, False, True, True, True, False, True, False, True, True, False, False, True, True])

# array is unchanged
assert all(join_true_runs(arr, 4) == arr)

# only position 3 is changed
list(join_true_runs(arr, 3) == arr)

# returns [True,
# True,
# True,
# False,
# True,
# ...
# ]

Of course, if you want to mutate the original array instead of returning a copy, you can do that too by removing the seq.copy() line.
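
For illustration, here is a minimal sketch of an in-place variant that uses only NumPy slicing. The function name join_true_runs_inplace is hypothetical, the input is assumed to be a one-dimensional array of dtype bool, and the function returns None to signal that it modifies its argument. It checks the same pattern (a single False with n True values on each side) by comparing shifted slices, so its temporary memory stays proportional to the array length.

```python
import numpy as np


def join_true_runs_inplace(arr, min_length_true=2):
    """Hypothetical in-place variant: modifies `arr` (1-D, dtype bool), returns None."""
    n = min_length_true
    size = arr.size
    if size < 2 * n + 1:
        return  # too short to contain the pattern

    # Candidate centres are indices n .. size - n - 1; start with "centre is False".
    # The ~ operator allocates a new array, so the in-place &= below never touches arr.
    hit = ~arr[n:size - n]
    for k in range(1, n + 1):
        hit &= arr[n - k:size - n - k]  # k-th neighbour to the left is True
        hit &= arr[n + k:size - n + k]  # k-th neighbour to the right is True

    # arr[n:size - n] is a view, so this assignment writes into arr itself
    arr[n:size - n][hit] = True


a = np.array([True] * 3 + [False] + [True] * 4 + [False] * 2 + [True] * 2)
join_true_runs_inplace(a, 2)  # a[3] is now True; a[8] and a[9] stay False
```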

T. Hall