Optimizing a simple Photon Detection Simulation

Question

I am a medical physics student trying to simulate photon detection - I succeeded (below) but I want to make it better by speeding it up: it currently takes 50 seconds to run and I want it to run in some fraction of that time. I assume someone more knowledgeable in Python could optimize it to complete within less than 10 seconds (without reducing num_photons_detected values). Thank you very much for trying out this little optimization challenge.

from random import seed
from random import random
import random 
import matplotlib.pyplot as plt
import numpy as np

rows, cols = (25, 25)

num_photons_detected = [10**3, 10**4, 10**5, 10**6, 10**7] 

lesionPercentAboveNoiseLevel = [1, 0.20, 0.10, 0.05] 

index_range = np.array([i for i in range(rows)])

for l in range(len(lesionPercentAboveNoiseLevel)):
    pixels = np.array([[0.0 for i in range(cols)] for j in range(rows)])

    for k in range(len(num_photons_detected)): 
        random.seed(a=None, version=2) 
        photons_random_pixel_choice = np.array([random.choice(index_range) for z in range(rows)]) 
        counts = 0
    
        while num_photons_detected[k] > counts:
            for i in photons_random_pixel_choice:
                photons_random_pixel_choice = np.array([random.choice(index_range) for z in range(rows)]) #further ensures random pixel selection
                for j in photons_random_pixel_choice:
                    pixels[i,j] +=1
                    counts +=1
        plt.imshow(pixels, cmap="gray") #in the resulting images/graphs, x is on the vertical and y on the horizontal
        plt.show()

Not likely a performance issue, but re-seeding the RNG each time through a loop is not normally how to get the kind of randomness you want. Most programs don't need to seed at all. — Karl Knechtel, Jan 22 '22 at 05:29
This question would probably be better answered on [code review](https://codereview.stackexchange.com/). That being said, I would suggest using cProfile (or some other profiler, but cProfile is built-in) to investigate which piece of the code is taking the longest to run, to consider your options for optimization. Always profile to see what needs to be optimized. — Kraigolas, Jan 22 '22 at 05:37
Is `pixels[i,j] +=1` intended to use the same value for `i`, for each of 25 `j`s? Or did you just want to pick a lot of random points? — Karl Knechtel, Jan 22 '22 at 05:44
What is the purpose of `for l in range(len(lesionPercentAboveNoiseLevel)):`? I don't see anything in the code that uses `l`, but `lesionPercentAboveNoiseLevel` seems to contain intentionally chosen values. — Karl Knechtel, Jan 22 '22 at 05:49
@KarlKnechtel The pixels[i,j] +=1 simulates the pixel at that location detecting a photon. The "for l" and l itself is used in a piece of the code not shown here because it isn't relevant to the "optimization problem" really (I should have deleted that for loop and related variable to avoid the confusion, my bad). — Corwin of Amber, Jan 22 '22 at 07:00
@Kraigolas thanks for letting me know about Code Review, I signed up! — Corwin of Amber, Jan 22 '22 at 07:30

score 4 · Accepted Answer · answered Jan 22 '22 at 06:26

I think that, aside from efficiency issues, a problem with the code is that it does not select the positions of photons truly at random. Instead, it selects rows numbers, and then for each selected row, it picks column numbers where photons will be observed in that row. As a result, if a row number is not selected, there will be no photons in that row at all, and if the same row is selected several times, there will be many photons in it. This is visible in the produced plots which have a clear pattern of lighter and darker rows:

Assuming that this is unintended and that each pixel should have equal chances of being selected, here is a function generating an array of a given size, with a given number of randomly selected pixels:

import numpy as np

def generate_photons(rows, cols, num_photons):
    rng = np.random.default_rng()
    indices = rng.choice(rows*cols, num_photons)
    np.add.at(pix:=np.zeros(rows*cols), indices, 1)
    return pix.reshape(rows, cols)

You can use it to produce images with specified parameters. E.g.:

import matplotlib.pyplot as plt

pixels = generate_photons(rows=25, cols=25, num_photons=10**4)
plt.imshow(pixels, cmap="gray")
plt.show()

gives:

`np.add.at` is exactly the tool I figured would exist, but had no idea how to look for. Or rather, I was unaware of the `at` method of ufuncs. — Karl Knechtel, Jan 22 '22 at 07:00

score 2 · Answer 2 · answered Jan 22 '22 at 05:44

photons_random_pixel_choice = np.array([random.choice(index_range) for z in range(rows)])

It seems like the goal here is:

Use a pre-made sequence of integers, 0 to 24 inclusive, to select one of those values.
Repeat that process 25 times in a list comprehension, to get a Python list of 25 random values in that range.
Make a 1-d Numpy array from those results.

This is very much missing the point of using Numpy. If we want integers in a range, then we can directly ask for those. But more importantly, we should let Numpy do the looping as much as possible when using Numpy data structures. This is where it pays to read the documentation:

size: int or tuple of ints, optional

Output shape. If the given shape is, e.g., (m, n, k), then m * n * k samples are drawn. Default is None, in which case a single value is returned.

So, just make it directly: photons_random_pixel_choice = random.integers(rows, size=(rows,)).

But "AttributeError: module 'random' has no attribute 'integers'" is what the code gives (with importing random) so I don't see how this can work. Please explain. — Corwin of Amber, Jan 22 '22 at 07:24

Optimizing a simple Photon Detection Simulation

2 Answers2