0

I am new here so I hope this post is legible/set up correctly. I am having difficulty trying to use curve_fit in python to fit histogram data to a sum of Poisson distributions. The data is a set of photon counts for different numbers of atoms. The set in this code is truncated to 0 atoms (background) 1 atom or 2 atoms. Whether I use a user defined Poisson or the poisson.pmf function from scipy I am unable to get a successful fit without errors.

```python
from scipy.special import factorial
from scipy.optimize import curve_fit
import matplotlib.pyplot as plt
import numpy as np

### User Defined poisson
def poissonian( x, A0, x0, A1, x1, A2, x2 ):
    return (A0 * x0**x * np.exp(-x0) / scipy.special.factorial(x)\
        +A1 * x1**x * np.exp(-x1) / scipy.special.factorial(x)\
        +A2 * x2**x * np.exp(-x2) / scipy.special.factorial(x))

### Built in poisson function
#def poissonian( x, A0, x0, A1, x1, A2, x2 ):
#    return (A0*poisson.pmf(x,x0)+A1*poisson.pmf(x,x1)+A2*poisson.pmf(x,x2))

binwidth = 75
bin2 = 18
xmin = 0
xmax = bin2 * binwidth

counts, bins, bars = plt.hist( counts_77_2x2, bins=bin2, range=(xmin, 
xmax) )
#counts_77_2x2 is a dataset for the number of photons detected from atoms 
#trapped in an optical tweezer. The point of this fit is to determine the 
#probability of loading n atoms in a given optical tweezer, 77 is for tweezer 
#7, 2x2 is the box pixel size for the target tweezer site imaged by emccd


bins = np.rint( bins[1:] - 75 / 2 ).astype( np.int32 )

#for reference
#bins =[  38  112  188  262  338  412  488  562  638  712  788  862  938 
#1012 1088 1162 1238 1312]
#counts= [ 2.  0.  1.  2.  2.  5.  3. 10. 11.  8.  6.  4. 10.  8.  8.  9.  
#7.  3.]


## Initial Guess
p0 = [1,400, 9,650, 8,1050]

#Bounds for fit
bounds=([1,1, 5,525, 5,850], [6,524, 14,800, 14,1350])

coeff, var_matrix = curve_fit(poissonian, bins, counts, p0=p0, 
maxfev=1000000)

print( coeff )```

Using a user defined poisson function yields

....RuntimeWarning: overflow encountered in powerreturn (A0 * x0**x * np.exp(-x0) / scipy.special.factorial(x)....

I have read the first few pages of responses to similar issues, I replaced the factorial with the scipy.special.factorial as I read it would accommodate higher value inputs but it was unable to help.

Using the built in poisson.pmf probability mass function yields

...\anaconda3\lib\site-packages\scipy\optimize\minpack.py:828: OptimizeWarning: Covariance of the parameters could not be estimated warnings.warn('Covariance of the parameters could not be estimated',

To reduce the calculation overhead I scaled down the x values to increments of 1 (all bins are equally spaced so it can scaled up for real peak centers later) Using this I was finally able to get a working (albeit ugly) fit.

```python
import scipy
from scipy.stats import poisson
from scipy.special import factorial
from scipy.optimize import curve_fit
import matplotlib.pyplot as plt
import numpy as np


def poissonian( x, A0, x0, A1, x1, A2, x2 ):
    return (A0 * x0**x * np.exp(-x0) / scipy.special.factorial(x)\
        +A1 * x1**x * np.exp(-x1) / scipy.special.factorial(x)\
        +A2 * x2**x * np.exp(-x2) / scipy.special.factorial(x))

binwidth = 75
bin2 = 18
xmin = 0
xmax = bin2 * binwidth

counts, bins, bars = plt.hist( counts_77_2x2, bins=bin2, range=(xmin, xmax) )

#for reference
#bins =[  38  112  188  262  338  412  488  562  638  712  788  862  938 1012 
#1088 1162 1238 1312]
#counts= [ 2.  0.  1.  2.  2.  5.  3. 10. 11.  8.  6.  4. 10.  8.  8.  9.  7.  
#3.]


### set bin separation to one to reduce calculation overhead (bins1)

bins1 = np.linspace( 1, 18, 18 )

## Scaled down guess values
p0 = [1,4, 8,10, 8,15]

#scaled down bounds
bounds=([0.1,1, 0.1,7, 0.1,13], [5,6, 12, 3, 12,18])

coeff, var_matrix = curve_fit(poissonian, bins1, counts, p0=p0, 
maxfev=1000000)

print( coeff )
```

Now that I got a scaled down version working, I wanted to know, does the code using the real data set have an error in it or is there a fundamental limitation for python's ability to fit poissonians for non small x valued data.

Thank you for your time!

Brian
  • 1
  • 1
  • 2
    Hi Brian, welcome to SO. Lots of code here, making it difficult to identify the problem. The shorter your [MCVE](https://stackoverflow.com/help/minimal-reproducible-example), the more likely you get good answers. Is your problem that the fit does not return plausible coefficients or that you cannot display the fit function? If the former, you may want to remove all parts that deal with the graphical representation. We also do not know what your input is - `counts_77_2x2` is not defined. – Mr. T Oct 20 '20 at 15:00
  • What I do not get is why you fit with `bins1`. How to you justify this remapping of your x-axis? – mikuszefski Oct 21 '20 at 11:57
  • Hi guys, thank you for suggestions, I hope I cleaned up the original post well enough to read. I was able to get a scaled down version working, I posted the errors the original data set yields above. mikuszefski, The real bins are all uniformly spaced by increments of 75, (I used np.rint on the original dataset to accommodate the poisson. I scaled down the x data to reduce calculation overhead to try and get a dummy version of the code working, also assuming I want to find the peak centers I am not sure why this would not scale back up? – Brian Oct 21 '20 at 17:51

0 Answers0