
Good day to you, fellow programmer!
Today I would like to do something I believe is tricky. I have a very large 2D array called tac that contains time curve values, and a file with a tuple of coordinates called coor that tells me where to place these curves in a 3D array. Together, these two variables actually represent a 4D array: the first 3 dimensions are space and the fourth is time. The data is stored this way to avoid keeping an immense number of zeros.

I would like to apply, for each time point (in other words, for each value along the 4th dimension), a Gaussian kernel to this set of data. I was able to generate this kernel and to perform the convolution quite easily for a fixed standard deviation over the whole array using scipy.ndimage.convolve. The kernel was created using scipy.signal.gaussian. Here is a brief example of the principle, where tac_4d contains the 4D array (it stores a lot of data, I know... but one problem at a time):

import numpy as np
from scipy import signal
from scipy import ndimage as nd

def gaussian_kernel_3d(radius, sigma):
    num = 2 * radius + 1
    kernel_1d = signal.gaussian(num, std=sigma).reshape(num, 1)
    kernel_2d = np.outer(kernel_1d, kernel_1d)
    kernel_3d = np.outer(kernel_1d, kernel_2d).reshape(num, num, num)
    kernel_3d = np.expand_dims(kernel_3d, -1)
    return kernel_3d

g = gaussian_kernel_3d(1, .5)
cag = nd.convolve(tac_4d, g, mode='constant', cval=0.0)

The trick is now to convolve the array with a kernel whose standard deviation is different for each SPACE coordinate. In other words, I would have a 3D array std containing a standard deviation for each voxel of the array.

It seems https://github.com/sheliak/varconvolve is the code needed to take care of this problem. However, I don't really understand how to use it and, quite frankly, I would prefer to come up with a genuine solution. Do you guys see a way to solve this problem?

Thanks in advance!

EDIT

Here is what I hope can be considered an MCVE:

import numpy as np
from scipy import signal
from scipy import ndimage as nd


def gaussian_kernel_2d(radius, sigma):
    num = 2 * radius + 1
    kernel_1d = signal.gaussian(num, std=sigma).reshape(num, 1)
    kernel_2d = np.outer(kernel_1d, kernel_1d)
    return kernel_2d


def gaussian_kernel_3d(radius, sigma):
    num = 2 * radius + 1
    kernel_1d = signal.gaussian(num, std=sigma).reshape(num, 1)
    kernel_2d = np.outer(kernel_1d, kernel_1d)
    kernel_3d = np.outer(kernel_1d, kernel_2d).reshape(num, num, num)
    kernel_3d = np.expand_dims(kernel_3d, -1)
    return kernel_3d


np.random.seed(0)

number_of_tac = 150
time_samples = 915
z, y, x = 100, 150, 100
voxel_number = x * y * z

# TACs in the right order
tac = np.random.uniform(0, 4, time_samples * number_of_tac).reshape(number_of_tac, time_samples)

arr = np.array([0] * (voxel_number - number_of_tac) + [1] * number_of_tac)
np.random.shuffle(arr)
arr = arr.reshape(z, y, x)
coor = np.where(arr != 0)  # non-empty voxel

# Algorithm to replace TAC in 3D space
nnz = np.zeros(arr.shape)
nnz[coor] = 1
tac_4d = np.zeros((z, y, x, time_samples))  # same (z, y, x) ordering as arr
tac_4d[np.where(nnz == 1)] = tac

# 3D convolution for all time
# TODO: find a way to make standard deviation change for each voxel
g = gaussian_kernel_3d(1, 1)  # 3D kernel of std = 1
v = np.random.uniform(0, 1, x * y * z).reshape(z, y, x)  # 3D array of std
cag = nd.convolve(tac_4d, g, mode='constant', cval=0.0)  # convolution
– Murnawful
  • There are two ways you can define the blurring (?) operation with a variable kernel. In the first, input data at a given `x` pushes values to output ranging from `x-a` to `x+a`, where the kernel has a width `2a` and the kernel depends on `x`. In the second, each point `x` in the output receives data from `x-a` to `x+a` in the input and the kernel depends on `x`. You need to write down the mathematical definition of what you want to do. – Han-Kwang Nienhuys Jul 17 '20 at 14:38
  • Also, can you please provide a [MCVE](https://stackoverflow.com/help/minimal-reproducible-example) that includes an example of a variable kernel? – Han-Kwang Nienhuys Jul 17 '20 at 14:43
  • Just provided MCVE (I believe it is minimal). The problem is quite generic. – Murnawful Jul 20 '20 at 06:54

1 Answer


Essentially, you have a 4D dataset of shape (nx, ny, nz, nt) that is sparse in (nx, ny, nz) and dense along the nt axis. If (i, j, k) are coordinates of nonzero points in the sparse dimensions, you want to convolve with a Gaussian 3D kernel that has a sigma that depends on (i, j, k).

For example, if there are nonzero points at [1, 2, 5] and [1, 4, 5] with corresponding sigmas 0.1 and 1.0, then the output at coordinates [1, 3, 5] is affected mostly by the [1, 4, 5] point because that one has the largest point spread.
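
A quick back-of-the-envelope check of those numbers (my own illustration, using unnormalized Gaussian weights; normalizing the kernels does not change the conclusion):

import numpy as np

d = 1.0                                  # [1, 3, 5] is one voxel away from both points
w_narrow = np.exp(-0.5 * (d / 0.1)**2)   # weight from the sigma=0.1 point, ~2e-22
w_wide = np.exp(-0.5 * (d / 1.0)**2)     # weight from the sigma=1.0 point, ~0.61
print(w_narrow, w_wide)                  # the wide kernel dominates at [1, 3, 5]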

Your question is ambiguous; it could also mean that point [1, 3, 5] has its own associated sigma, for example 0.5, and pulls data from the two adjacent points with equal weight. I will assume the first definition (sigma values associated with input points, not with output points).
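
To make the distinction concrete, here is a minimal 1D sketch (toy data of my own, not part of the implementation below) of the two definitions; the code further down follows the first one:

import numpy as np

n = 9
f = np.zeros(n)
f[2], f[6] = 1.0, 1.0                              # two impulses
sig = np.full(n, 0.5)
sig[2], sig[6] = 0.3, 1.2                          # per-point sigmas
x = np.arange(n)

def gauss(d, s):
    """Normalized Gaussian weights for offsets d and standard deviation s."""
    w = np.exp(-0.5 * (d / s)**2)
    return w / w.sum()

# Definition 1 ("scatter"): each input point spreads out with its own sigma.
out_scatter = np.zeros(n)
for i in range(n):
    out_scatter += f[i] * gauss(x - x[i], sig[i])

# Definition 2 ("gather"): each output point collects input with its own sigma.
out_gather = np.array([np.sum(f * gauss(x - x[i], sig[i])) for i in range(n)])

print(out_scatter.round(3))
print(out_gather.round(3))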

Because the operation is not a true convolution, there is no fast FFT-based method to do the whole thing in one pass. Instead, you have to loop over the sigma values. Fortunately, your example has only 150 nonzero points, so the loop is not too expensive.

Here is an implementation. I keep the data in sparse representation as long as possible.

import scipy.signal
import numpy as np

def kernel3d(mm, sigma):
    """Return (mm, mm, mm) shaped, normalized kernel."""
    g1 = scipy.signal.gaussian(mm, std=sigma)
    g3 = g1.reshape(mm, 1, 1) * g1.reshape(1, mm, 1) * g1.reshape(1, 1, mm)    
    return g3 * (1/g3.sum())

np.random.seed(1)
s = 2 # scaling factor (original problem: s=10)
nx, ny, nz, nt, nnz = 10*s, 11*s, 12*s, 91*s, 15*s
# select nnz random voxels to fill with time series data
randint = np.random.randint
tseries = {} # key: (i, j, k) tuple; value: time series data, shape (nt,)
for _ in range(nnz):
    while True:
        ijk = (randint(nx), randint(ny), randint(nz))
        if ijk not in tseries:
            tseries[ijk] = np.random.uniform(0, 1, size=nt)
            break
        
ijks = np.array(list(tseries.keys())) # shape (nnz, 3)

# sigmas: key: (i, j, k) tuple; value: standard deviation
sigmas = { k: np.random.uniform(0, 2) for k in tseries.keys() }

# output will be stored as dense array, padded to avoid edge issues 
# with convolution.
m = 5 # padding size
cag_4dp = np.zeros((nx+2*m, ny+2*m, nz+2*m, nt))

mm = 2*m + 1 # kernel width
for (i, j, k), tdata in tseries.items():
    kernel = kernel3d(mm, sigmas[(i, j, k)]).reshape(mm, mm, mm, 1)
    # convolution of one voxel by kernel is trivial.
    # slice4d_c has shape (mm, mm, mm, nt).
    slice4d_c = kernel * tdata
    cag_4dp[i:i+mm, j:j+mm, k:k+mm, :] += slice4d_c

cag_4d = cag_4dp[m:-m, m:-m, m:-m, :]

#%%

import matplotlib.pyplot as plt
plt.close('all')
fig, axs = plt.subplots(2, 2, tight_layout=True)
# find a few planes
#ks = np.where(np.any(cag_4d != 0, axis=(0, 1,3)))[0]
ks = ijks[:4, 2]
for ax, k in zip(axs.ravel(), ks):
    ax.imshow(cag_4d[:, :, k, nt//2].T)
    ax.set_title(f'Voxel [:, :, {k}] at time {nt//2}')

fig.show()

for ijk, sigma in sigmas.items():
    print(f'{ijk}: sigma={sigma:.2f}')
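
As a quick sanity check (my addition, not part of the code above): since kernel3d is normalized and the padding is wide enough to hold the full kernel, the padded output should carry exactly the summed input signal at every time step.

total_in = np.sum(list(tseries.values()), axis=0)  # shape (nt,)
total_out = cag_4dp.sum(axis=(0, 1, 2))            # shape (nt,)
print(np.allclose(total_in, total_out))            # expected: True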

Slices from the convolved 4D dataset

– Han-Kwang Nienhuys
  • Thank you very much for your wonderful answer. It works well indeed. Sorry for the ambiguity in the question but as you deduced, it is indeed the input points that should be pulled in the operation, otherwise I believe the order of application of the convolution would have an impact, which should not be the case. Once again thank you for the help ! – Murnawful Jul 21 '20 at 14:14
  • I just realized that the `scipy.signal.convolve` call is redundant; I updated the code (it's more efficient now). Both definitions of the "convolution" are valid and linear, but the one that you picked is vastly more efficient in the implementation. – Han-Kwang Nienhuys Jul 21 '20 at 17:33