
I have two arrays of signals, both 16x90000, i.e. two arrays with 16 signals each. I want to perform matched filtering on the signals row by row, correlating row 1 of array 1 with row 1 of array 2, and so forth. I've tried SciPy's signal.convolve2d, but it is extremely slow, taking tens of seconds to convolve even a 2x90000 array. I'm not sure if I'm simply implementing it wrong, or if there is a more efficient way of achieving what I want. I know the arrays are long, but I feel it should still be achievable. I have a feeling convolve2d is actually doing a full 2-D convolution (convolving across the rows as well), which would square the amount of work, but I may be misunderstanding.

My implementation:

from scipy import signal as sig

A.shape = (16, 90000)  # an array of 16 signals, each 90000 samples long
B.shape = (16, 90000)  # another array of 16 signals, each 90000 samples long

corr = sig.convolve2d(A, B, mode='same')
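
For reference, a quick check on tiny inputs seems to confirm that suspicion, since the 'full' output grows in both dimensions:

import numpy as np

small_A = np.random.randn(2, 5)
small_B = np.random.randn(2, 5)
print(sig.convolve2d(small_A, small_B, mode='full').shape)
# (3, 9), i.e. (2 + 2 - 1, 5 + 5 - 1): a full 2-D convolution, not row-wise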

I haven't had much coffee yet so there's every chance I'm being stupid right now.

Please no for loops.

  • Well, yes. convolve2d works along both dimensions. – Mercury Nov 30 '22 at 10:14
  • Suspected as much, so is there an alternative that works row by row only? @Mercury – Jack Nov 30 '22 at 10:21
  • A naive solution could be using [scipy.ndimage.convolve1d](https://docs.scipy.org/doc/scipy/reference/generated/scipy.ndimage.convolve1d.html), like `np.vstack([convolve1d(a, b, axis=1) for a, b in zip(A, B)])`. Trying to think if there is something without a loop. This should still be faster than convolve2d, though. – Mercury Nov 30 '22 at 10:29
  • Also, check if you need convolve or [correlate](https://docs.scipy.org/doc/scipy/reference/generated/scipy.signal.correlate.html) for your use case, since you wish to do matched filtering? – Mercury Nov 30 '22 at 11:27
  • Not sure why I was using convolve... Yes I want to be using correlate. @Mercury – Jack Nov 30 '22 at 11:33
  • Let us [continue this discussion in chat](https://chat.stackoverflow.com/rooms/250014/discussion-between-mercury-and-jack). – Mercury Nov 30 '22 at 11:36

1 Answer


Since you need to correlate the signals row by row, the most basic solution would be:

import numpy as np
from scipy.signal import correlate

# sample inputs: A and B both have n signals of length m

n, m = 2, 5
A = np.random.randn(n, m)
B = np.random.randn(n, m)

C = np.vstack([correlate(a, b, mode="same") for a, b in zip(A, B)])

# [[-0.98455996  0.86994062 -1.1446486  -2.1751074  -0.59270322]
#  [ 1.7945015   1.51317292  1.74286042 -0.57750712 -1.9178488 ]]
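
As came up in the comments, matched filtering calls for correlation rather than convolution; for real signals the two differ only by a time reversal of the second input, which a quick check on the same sample inputs illustrates:

from scipy.signal import convolve

# correlating with b is equivalent to convolving with b reversed (real signals)
a, b = A[0], B[0]
assert np.allclose(correlate(a, b, mode="same"), convolve(a, b[::-1], mode="same"))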

One way to avoid a looped solution is to piggyback on a deep-learning library like PyTorch. Torch's conv1d (despite the name, it actually computes cross-correlation) can handle this scenario.

import torch
import torch.nn.functional as F

# Convert A and B to torch tensors with the shapes conv1d expects
P = torch.from_numpy(A).unsqueeze(0)  # (1, n, m): one batch with n input channels
Q = torch.from_numpy(B).unsqueeze(1)  # (n, 1, m): n filters with 1 channel each

# Use conv1d with groups=n so each row of A is correlated only with the matching row of B
def torch_correlate(x, w, n):
    with torch.no_grad():
        return F.conv1d(x, w, bias=None, stride=1, groups=n, padding="same").squeeze(0).numpy()

R = torch_correlate(P, Q, n)
# [[-0.98455996  0.86994062 -1.1446486  -2.1751074  -0.59270322]
#  [ 1.7945015   1.51317292  1.74286042 -0.57750712 -1.9178488 ]]

However, I don't believe there would be a significant difference in runtime here, since the grouped convolution may well be using some form of iteration internally as well (plus there is the overhead of converting between NumPy and Torch tensors to consider).
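
If you want to verify that on your actual problem size, a rough timing comparison (reusing correlate and torch_correlate from above; exact numbers depend on your machine) could look like this:

import time

n, m = 16, 90000
A = np.random.randn(n, m)
B = np.random.randn(n, m)

t0 = time.perf_counter()
C_scipy = np.vstack([correlate(a, b, mode="same") for a, b in zip(A, B)])
print(f"scipy row-by-row: {time.perf_counter() - t0:.2f} s")

t0 = time.perf_counter()
C_torch = torch_correlate(torch.from_numpy(A).unsqueeze(0), torch.from_numpy(B).unsqueeze(1), n)
print(f"torch conv1d:     {time.perf_counter() - t0:.2f} s")

# note: conv1d does a direct (non-FFT) correlation, so a 90000-sample kernel can
# take a while on CPU, whereas scipy.signal.correlate switches to an FFT-based
# method automatically for long signals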

I would generally suggest the first method. But if you are working with really large signals, the PyTorch version lets you run it on a GPU, which you won't be able to do with the regular SciPy one.
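
For completeness, a minimal sketch of the GPU route (assuming a CUDA-capable device is available and the tensors fit in GPU memory):

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

# float64 works, but converting to float32 first is typically much faster on GPUs
P_gpu = torch.from_numpy(A).unsqueeze(0).to(device)  # (1, n, m)
Q_gpu = torch.from_numpy(B).unsqueeze(1).to(device)  # (n, 1, m)

with torch.no_grad():
    R_gpu = F.conv1d(P_gpu, Q_gpu, bias=None, stride=1, groups=n, padding="same")

R = R_gpu.squeeze(0).cpu().numpy()  # back to an (n, m) NumPy array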
