Python convolution with histogram and Gaussian

Question

I have a simulated signal which is displayed as an histogram. I want to emulate the real measured signal using a convolution with a Gaussian with a specific width, since in the real experiment a detector has a certain uncertainty in the measured channels.

I have tried to do a convolution using np.convolve as well as scipy.signal.convolve but can't seem to get the filtering correctly. Not only the expected shape is off, which would be a slightly smeared version of the histogram and the x-axis e.g. energy scale is off aswell.

I tried defining my Gaussian with a width of 20 keV as:

gauss = np.random.normal(0, 20000, len(coincidence['esum']))
hist_gauss = plt.hist(gauss, bins=100)[0]

where len(coincidence['esum']) is the length of my coincidencedataframe column.This column I bin using:

counts = plt.hist(coincidence['esum'], bins=100)[0]

Besides this approach to generate a suitable Gaussian I tried scipy.signal.gaussian(50, 30000) which unfortunately generates a parabolic looking curve and does not exhibit the characteristic tails.

I tried doing the convolution using both coincidence['esum'] and counts with the both Gaussian approaches. Note that when doing a simple convolution with the standard example according to Finding the convolution of two histograms it works without problems.

Would anyone know how to do such a convolution in python? I exported the column of coincidende['esum'] that I use for my histogram to a pastebin, in case anyone is interested and wants to recreate it with the specific data https://pastebin.com/WFiSBFa6

Bob · Accepted Answer · 2021-08-13T09:44:48.373

As you may be aware, doing the convolution of the two histograms with the same bin size will give the histogram of the result of adding each element of one of the samples with each elements of the other of the samples.

I cannot see exactly what you are doing. One important thing that you seem to not be doing is to make sure that the bins of the histograms have the same width, and you have to take care of the position of the edges of the second bin.

In code we have

def hist_of_addition(A, B, bins=10, plot=False):
    A_heights, A_edges = np.histogram(A, bins=bins)
    # make sure the histogram is equally spaced
    assert(np.allclose(np.diff(A_edges), A_edges[1] - A_edges[0]))
    # make sure to use the same interval
    step = A_edges[1] - A_edges[0]
    
    # specify parameters to make sure the histogram of B will
    # have the same bin size as the histogram of A
    nBbin = int(np.ceil((np.max(B) - np.min(B))/step))
    left = np.min(B)
    B_heights, B_edges = np.histogram(B, range=(left, left + step * nBbin), bins=nBbin)
    
    # check that the bins for the second histogram matches the first
    assert(np.allclose(np.diff(B_edges), step))
    
    C_heights = np.convolve(A_heights, B_heights)
    C_edges = B_edges[0] + A_edges[0] + np.arange(0, len(C_heights) + 1) * step
    
    if plot:
        plt.figure(figsize=(12, 4))
        plt.subplot(131)
        plt.bar(A_edges[:-1], A_heights, step)
        plt.title('A')
        plt.subplot(132)
        plt.bar(B_edges[:-1], B_heights, step)
        plt.title('B')
        plt.subplot(133)
        plt.bar(C_edges[:-1], C_heights, step)
        plt.title('A+B')
    return C_edges, C_heights

Then

A = -np.cos(np.random.rand(10**6))
B = np.random.normal(1.5, 0.025, 10**5)
hist_of_addition(A, B, bins=100, plot=True);

Gives

Hey @Bob, thanks a lot for your answer! The making sure of equidistant bins and same steps interval: Isn't it redundant with the standard use of `np.hist()` as it's used? Are there some edge cases where it wouldn't do that. I thought that the default would always account for this. My second question regarding the `noise` variable: How did you define it and what is it's use there? Also: You are right and probably not binning my hists correctly was my main issue, altough I tried it to simply set the binning of my Gaussian and the data both to let's say `bins=60` in `plt.hist()`. — Cheeky Leo, Aug 13 '21 at 08:45
`np.histogram` or `plt.hist` will select the step automatically so that it fits all the data in the given number of bins, if it was fixed there would be no reason to return edges. — Bob, Aug 13 '21 at 09:47
The noise there was a mistake, when I copied it into the function to give you I renamed noise to B and I forgot a few instances. — Bob, Aug 13 '21 at 09:47
Alright, thanks a lot for the explanation and help, the way you implemented it is exactly how I thought it should work. My only issue now would be to have a signal amplitude of one, such as in `scipy.signal.windows.gaussian` so that the counts are not multiplied as they are now. Setting `density=True` in ' np.random.normal' only sets the integral equal to one. And feeding in the windows.gaussian doesn't work as it already is binned on it's own — Cheeky Leo, Aug 13 '21 at 12:05

Python convolution with histogram and Gaussian

1 Answers1