How do I isolate a calculation of a function within a python 'for loop' to avoid a broadcasting shape conflict?

Question

I am trying to calculate a series of Gaussian curves for peak coordinates in an IR spectrum. The script that calculates a single peak over a frequency range (x) of 0 to 4000 1/cm works as expected. However, when I try to iterate over 75 frequency and intensity coordinate pairs, I get a broadcast shape conflict between the 4001 'x' values and the 75 peak coordinate pairs. Is there a way to isolate the calculation so that it behaves like a series of independent calculations and thereby avoids the conflict?

Here is my code and error traceback:

import numpy as np

def gaussian(intens, mu):
    x = np.arange(4001)
    sig = 50
    return intens*np.exp(-np.power(x-mu, 2.)/(2*np.power(sig, 2.)))

results = np.empty((4001, 1), float)

for i in range(75):
    mu = np.array([106.2516, 169.2317, 179.4433, 210.1843, 225.1875, 237.6963, 261.1454,
    290.3952, 298.8429, 383.1141, 394.5482, 415.7989, 474.0785, 522.2687,
    555.9868, 571.7233, 617.1713, 646.9524, 712.1052, 757.1555, 839.7896,
    862.2479, 874.9923, 927.4888, 948.9697, 951.0036, 964.3596, 969.371,
    1008.6015, 1039.7932, 1044.8249, 1063.0541, 1107.298, 1127.9082, 1155.2848,
    1180.83, 1196.411, 1225.1961, 1234.4729, 1256.5558, 1278.3917, 1284.0116,
    1311.6421, 1338.709, 1346.252, 1360.011, 1434.1602, 1439.0059, 1455.3892,
    1490.6434, 1512.7327, 1517.3906, 1521.4376, 1525.9011, 1531.1185, 1540.3454,
    1546.1395, 1554.7932, 1841.6486, 3045.7824, 3050.0779, 3053.1525, 3064.5046,
    3070.2651, 3073.4956, 3094.2865, 3097.3753, 3101.0081, 3107.7236, 3108.5122,
    3115.0888, 3117.7676, 3123.2296, 3127.9553, 3141.7127])
    intens = np.array([3.609400e+00, 6.870000e-02, 1.425000e-01, 1.908000e-01, 2.848000e-01,
    9.040000e-01, 7.114000e-01, 3.850000e-01, 1.899100e+00, 7.697000e-01,
    1.484000e-01, 1.223400e+00, 5.366000e-01, 4.554700e+00, 2.007100e+00,
    8.798000e-01, 9.361000e-01, 1.767700e+00, 4.380000e-01, 6.543100e+00,
    4.705000e-01, 1.423900e+00, 5.475000e-01, 1.230200e+00, 3.059800e+00,
    4.872000e-01, 1.293400e+00, 2.782900e+00, 5.430000e-02, 1.592800e+00,
    2.582030e+01, 2.047560e+01, 1.544500e+00, 4.941600e+00, 1.135200e+00,
    6.229000e-01, 3.967100e+00, 1.082100e+00, 5.126800e+00, 3.136400e+00,
    3.190000e-02, 3.438700e+00, 6.669500e+00, 2.266600e+00, 1.033200e+00,
    4.739000e+00, 4.292300e+00, 4.469500e+00, 6.858500e+00, 8.952200e+00,
    2.593600e+00, 6.386200e+00, 4.342300e+00, 2.799900e+00, 1.920900e+00,
    3.788000e-01, 4.900100e+00, 4.086800e+00, 2.093403e+02, 1.231370e+01,
    1.935290e+01, 3.692450e+01, 2.791320e+01, 1.315910e+01, 2.868290e+01,
    2.371370e+01, 1.425640e+01, 4.406400e+00, 7.293400e+00, 5.097790e+01,
    4.594300e+01, 3.229710e+01, 1.685690e+01, 2.933100e+01, 2.938250e+01])
    g_calc = map(gaussian(intens, mu), zip(intens, mu))
    results = vstack(results, g_calc)
results

---------------------------------------------------------------------------

ValueError                                Traceback (most recent call last)

<ipython-input-29-c890381ce491> in <module>()
 35      2.371370e+01, 1.425640e+01, 4.406400e+00, 7.293400e+00, 5.097790e+01,
 36      4.594300e+01, 3.229710e+01, 1.685690e+01, 2.933100e+01, 2.938250e+01])
---> 37     g_calc = map(gaussian(intens, mu), zip(intens, mu))
 38     results = vstack(results, g_calc)
 39 results

<ipython-input-29-c890381ce491> in gaussian(intens, mu)
  4     x = np.arange(4001)
  5     sig = 50
----> 6     return intens*np.exp(-np.power(x-mu, 2.)/(2*np.power(sig, 2.)))
  7 
  8 results = np.empty((4001, 1), float)

ValueError: operands could not be broadcast together with shapes (4001,) (75,)

BenBoulderite · Accepted Answer · 2018-05-25T17:09:20.187


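If I understand your question correctly, what you want to compute is the sum over all peaks i, evaluated at each frequency grid point j:

spectrum[j] = sum_i intens[i] * exp(-(freq_mesh[j] - freq_0[i])**2 / (2 * sigmaf[i]**2))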

(If this is right, I would suggest rewording the title, as the problem is not specific to Gaussians.)

For this kind of operation, I use numpy.meshgrid extensively. Here, it creates 2D arrays in which one dimension is your frequency grid (the j index) and the other corresponds to the different peaks (the i index). You can then apply NumPy's vectorized array operations to them. See if the code below produces the output you expect:

import numpy as np
import matplotlib.pyplot as plt

intens = np.array([0.1, 0.22, 0.13, 0.51, 0.4])
freq_0 = np.array([500.3, 123.4, 1023.6, 2562.45, 3126.2])
sigmaf = np.array([10.3,   20.4,  5.6,    40.5, 26.2])

freq_mesh = np.linspace(0.0,4000.0, num=4001, endpoint=True)

[I, FM] = np.meshgrid(intens, freq_mesh)
[F0,FM] = np.meshgrid(freq_0, freq_mesh)
[SF,FM] = np.meshgrid(sigmaf, freq_mesh)

signal_2d_arr = I*np.exp(-(FM-F0)**2/(2*SF**2))

spectrum = np.sum(signal_2d_arr, axis = 1)

plt.figure()
plt.plot(freq_mesh, spectrum)
plt.show()
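
As a possible alternative, the same sum can be sketched without meshgrid by relying on broadcasting directly. Giving the frequency grid a trailing axis makes its shape (4001, 1), which broadcasts against the (5,) peak arrays to (4001, 5). The explicit per-peak loop is included only for comparison, as one way to run the "series of independent calculations" the question asks about. The names fm_col, spectrum_broadcast and spectrum_loop are introduced here for illustration and do not appear in the original post.

```python
import numpy as np

intens = np.array([0.1, 0.22, 0.13, 0.51, 0.4])
freq_0 = np.array([500.3, 123.4, 1023.6, 2562.45, 3126.2])
sigmaf = np.array([10.3, 20.4, 5.6, 40.5, 26.2])
freq_mesh = np.linspace(0.0, 4000.0, num=4001, endpoint=True)

# Column vector of frequencies: shape (4001, 1).
# Against the (5,) peak arrays this broadcasts to (4001, 5), one column per peak.
fm_col = freq_mesh[:, np.newaxis]
signal_2d_arr = intens * np.exp(-(fm_col - freq_0) ** 2 / (2 * sigmaf ** 2))
spectrum_broadcast = signal_2d_arr.sum(axis=1)

# Explicit loop for comparison: each peak is evaluated on the full grid
# with scalar parameters, so no shape conflict can arise.
spectrum_loop = np.zeros_like(freq_mesh)
for a, f0, s in zip(intens, freq_0, sigmaf):
    spectrum_loop += a * np.exp(-(freq_mesh - f0) ** 2 / (2 * s ** 2))
```

Both variants should agree to floating-point precision. The broadcast form avoids building the three intermediate meshgrid arrays, while the loop keeps memory use at a single 4001-element array.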

edited May 25 '18 at 17:09

answered May 25 '18 at 16:58

BenBoulderite


Fantastic, just what I was looking for and even sorted out how to do the plotting. I was thinking that I would have to take the maximum value in each column and plot that. But this does it all in one go. It is a pity that I cannot post the plot here in the comment. I think that this is also the answer to this question: https://stackoverflow.com/questions/35461477/how-to-plot-a-gaussian-function-on-python. – S. OConnor May 25 '18 at 17:57
Should you need the maximum over `i`, you would replace the `np.sum` line with: `spectrum = np.amax(signal_2d_arr, axis = 1)` – BenBoulderite May 25 '18 at 18:21
I'd normally advocate just spelling it `np.max` – Eric May 25 '18 at 18:26
Tried both np.max and np.amax and both spellings produce identical results. Thanks for the suggestion. – S. OConnor May 25 '18 at 19:17
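
To illustrate the sum-versus-maximum distinction raised in these comments, here is a small sketch with two made-up overlapping peaks. All values and the names curves, summed and envelope are hypothetical. Summing over the peak axis adds the overlapping tails, while the maximum keeps only the tallest curve at each frequency.

```python
import numpy as np

x = np.linspace(0.0, 100.0, num=101)
intens = np.array([1.0, 2.0])   # hypothetical peak heights
mu = np.array([40.0, 60.0])     # hypothetical peak centres
sig = 10.0                      # hypothetical common width

# Shape (101, 2): one column per peak.
curves = intens * np.exp(-(x[:, np.newaxis] - mu) ** 2 / (2 * sig ** 2))

summed = np.sum(curves, axis=1)    # overlapping tails add up
envelope = np.max(curves, axis=1)  # tallest single curve at each x
```

Because every curve is non-negative, summed is never below envelope. The two differ most where the peaks overlap.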
