1

I have the following code which calculates the number of pairs 1 <= i < j <= n such that xs[i] == ys[j]:

def f(xs, ys):
    s = 0
    for j in range(xs.size):
        s += np.sum(xs[:j] == ys[j])
    return s

This is called a couple of times from some other procedure, so I need it to be as fast as possible (at some memory cost).

The sizes of the arrays are > 1e6.

Is there a faster equivalent that makes use of some numpy magic which gets rid of the for loop?

Ehsan
  • 12,072
  • 2
  • 20
  • 33
Alexandru Dinu
  • 1,159
  • 13
  • 24
  • 1
    Can you use Numba? It’s built for this kind of thing. – David Hoffman Aug 22 '20 at 05:47
  • 1
    Try using `np.broadcast_to(xs, (xs.shape[0], xs.shape[0]))` to get a square matrix of repeated xs, then `np.tril` to zero-out elements above the diagonal, then `np.sum(zeroed_matrix == ys)`... not sure about the last part, but play around with it – RichieV Aug 22 '20 at 06:31
  • 1
    `np.sum(zeroed_matrix==ys[:, np.newaxis])` as in [this answer](https://stackoverflow.com/a/33045903/6692898) – RichieV Aug 22 '20 at 06:35

1 Answers1

1

One way of doing it if xs and ys are of the same size:

s = np.triu(xs[:,None]==ys,1).sum()

And if xs and ys are different sizes (according to your code, you only compare same length of ys with xs. In case you want to compare xs with ALL ys, use line above):

s = np.triu((xs[:,None]==ys[:xs.size]),1).sum()

or equivalently:

s = (xs[:,None]==ys)[np.triu_indices(xs.size,1)].sum()

Which compares all elements of a 2-D xs with ys and sum the equals on upper triangle (same as loop's inner line)

If your arrays are too large and you run into memory issue, simply chunk your arrays and use lines above to compare blocks on diagonal plus all of the off-diagonal blocks on upper triangle and add them up.

Ehsan
  • 12,072
  • 2
  • 20
  • 33