I am currently processing experimental data for my thesis and am running into a problem with scipy curve_fit.
Background
This is a study of LED emission with the following model depicting the absorption spectra for a specific LED composition/wavelength.
The model is this:
The basic idea: we have experimental data and want to fit this equation to it to get a best estimate of a vertical shift in the data, which is caused by the equipment used in the experiment. To get that vertical shift, the function passed to curve_fit takes the form a + c * E * np.sqrt(E-bandE) * np.exp(-E*b). Here bandE (Eg) is the bandgap energy of the material, which is given in the code section, and E is the photon energy.
What I did
The values I am using live in a pandas dataframe; I have copied them out as lists below so you can paste them if you want:
photon_energy = [1.1271378805005456, 1.1169834851807208, 1.1070104183487501, 1.0972138659739825, 1.0875891829391229, 1.0781318856961741, 1.0688376453022415, 1.0597022808124787, 1.0507217530089832, 1.0418921584458825, 1.0332097237921667, 1.0246708004550413, 1.016271859467705, 1.0080094866265041, 0.9998803778633872, 0.9918813348404801, 0.9840092607544446, 0.9762611563390552, 0.9686341160551564, 0.9611253244578295, 0.9537320527312309, 0.9464516553821375, 0.939281567083788, 0.9322192996621053, 0.9252624392168658, 0.918408643370815, 0.9116556386401471, 0.9050012179201461, 0.898443238080145, 0.8919796176623023, 0.885608334679, 0.8793274245039717, 0.8731349778525352, 0.8670291388465735, 0.8610081031601389, 0.8550701162417932, 0.8492134716100002, 0.8434365092180953, 0.8377376138855407, 0.8321152137923491, 0.8265677790337335]
s2c = 1.0711371944297785, 1.0231329828975677, 1.0994106908895496, 1.5121380434280387, 1.4362625879245816, 1.6793735384201034, 1.967376254925342, 2.718958670464331, 2.8657461347457933, 3.2265806746948247, 4.073118384895329, 5.002080377098846, 5.518310980392261, 6.779117609004787, 7.923629188601875, 9.543272102194026, 11.061716095291905, 12.837722885549315, 15.156654004011116, 17.604461138085984, 20.853321055852934, 24.79640344112394, 28.59835938028905, 32.5257456, 37.87676923906976, 42.15321400245093, 46.794297771521705, 56.44267690099888, 61.60473904566305, 70.99822229568558, 77.60736232076566, 84.37513036736146, 92.9038746946938, 107.54475674330527, 117.91910226690293, 137.67481655050688, 158.02001455302846, 176.37334256204952, 195.20886164268876, 215.87011902349641, 240.41535423461914]
The fit
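import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
from scipy.optimize import curve_fit

# Rebuild the dataframe from the two lists above
new_df = pd.DataFrame({'Photon Energy': photon_energy, 'S2c': list(s2c)})

bandE = 0.7435616030790153  # bandgap energy Eg (eV)

def exp_fit(E, a, b, c):
    # return a + c * E * np.sqrt(E - bandE) * np.exp(-E/0.046)  # Eg and k are already defined previously
    return a + c * E * np.sqrt(E - bandE) * np.exp(-E * b)

# Dense energy grid (defined here but not used in the plot below)
E = np.linspace(np.min(new_df['Photon Energy']), np.max(new_df['Photon Energy']), 1000)
# p0 is the best guess of the a, b and c values
popt, pcov = curve_fit(exp_fit, new_df['Photon Energy'], new_df['S2c'], maxfev=10000, p0=[0, 500/23, 1e+9])
print(popt)
plt.plot(new_df['Photon Energy'], new_df['S2c'], 'o', label='S2c')
plt.plot(new_df['Photon Energy'], exp_fit(new_df['Photon Energy'], *popt), '-', label='S2c fit')
plt.ylabel('Emission Intensity (a.u.)')
plt.xlabel('Photon Energy (eV)')
plt.yscale('log')
plt.legend()
plt.show()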
And this is what we end up getting.
out: [1.59739310e+00 2.50268369e+01 9.55186101e+11]
So after a long discussion with the person I am working with (we aren't that knowledgeable about Python or data science), we agree that everything except the a coefficient fits really well. (b doesn't really matter because it will be explicitly calculated at a later step. c matters a lot, and it appears to be of the right order of magnitude.) Because a is a vertical shift, we expect it to be a constant, but the fitted curve diverges from the data because of it.
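To put rough numbers on that divergence, here is a minimal sketch that plugs the fitted values printed above into the model at the two ends of the energy range. The helper name emission_term and the variables share_hi and share_lo are made up for illustration, and the fitted values are the rounded ones from the printed output, so treat the result as approximate.

```python
import numpy as np

bandE = 0.7435616030790153  # bandgap energy from the post (eV)

# Fitted a, b, c as printed in the post (rounded by NumPy's print)
a, b, c = 1.59739310e+00, 2.50268369e+01, 9.55186101e+11

# First and last data points from the lists in the post: (photon energy, S2c)
E_hi, y_hi = 1.1271378805005456, 1.0711371944297785
E_lo, y_lo = 0.8265677790337335, 240.41535423461914


def emission_term(E):
    """Model value without the vertical shift a (hypothetical helper)."""
    return c * E * np.sqrt(E - bandE) * np.exp(-E * b)


for label, E, y in [("highest E", E_hi, y_hi), ("lowest E", E_lo, y_lo)]:
    model = a + emission_term(E)
    # Fraction of the model value that comes from the offset alone
    share = a / model
    print(f"{label}: data={y:.3g}  model={model:.3g}  offset share={share:.1%}")

share_hi = a / (a + emission_term(E_hi))
share_lo = a / (a + emission_term(E_lo))
```

With these inputs the offset makes up most of the model value at the high-energy end and only a small fraction at the low-energy end, which is consistent with the curve peeling away from the points on the log axis.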
The problem
As mentioned in the question title and the previous paragraph, we expect a to be about 5e-4, or within that order of magnitude, but we are getting something far too large for this experiment. If anyone is proficient with the curve_fit feature of scipy, do help us out!
Additional info: we used to use OriginLab (think of a more expensive Microsoft Excel), but the license is very expensive, so we are trying to use Python instead. This method does work in OriginLab and does not result in a divergence in the fit, so we figured it might have something to do with the algorithm that curve_fit uses.
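One thing that may be worth checking (an assumption, not a confirmed diagnosis): by default curve_fit minimizes the plain sum of squared residuals, so points near 240 count far more than points near 1. The sketch below repeats the fit from this post and then refits with sigma set to the data values, which weights the residuals relatively. The names popt_plain, popt_rel and rel_chi2 are made up for illustration, and whether OriginLab applies a comparable weighting is not known here.

```python
import numpy as np
from scipy.optimize import curve_fit

photon_energy = np.array([1.1271378805005456, 1.1169834851807208, 1.1070104183487501, 1.0972138659739825, 1.0875891829391229, 1.0781318856961741, 1.0688376453022415, 1.0597022808124787, 1.0507217530089832, 1.0418921584458825, 1.0332097237921667, 1.0246708004550413, 1.016271859467705, 1.0080094866265041, 0.9998803778633872, 0.9918813348404801, 0.9840092607544446, 0.9762611563390552, 0.9686341160551564, 0.9611253244578295, 0.9537320527312309, 0.9464516553821375, 0.939281567083788, 0.9322192996621053, 0.9252624392168658, 0.918408643370815, 0.9116556386401471, 0.9050012179201461, 0.898443238080145, 0.8919796176623023, 0.885608334679, 0.8793274245039717, 0.8731349778525352, 0.8670291388465735, 0.8610081031601389, 0.8550701162417932, 0.8492134716100002, 0.8434365092180953, 0.8377376138855407, 0.8321152137923491, 0.8265677790337335])
s2c = np.array([1.0711371944297785, 1.0231329828975677, 1.0994106908895496, 1.5121380434280387, 1.4362625879245816, 1.6793735384201034, 1.967376254925342, 2.718958670464331, 2.8657461347457933, 3.2265806746948247, 4.073118384895329, 5.002080377098846, 5.518310980392261, 6.779117609004787, 7.923629188601875, 9.543272102194026, 11.061716095291905, 12.837722885549315, 15.156654004011116, 17.604461138085984, 20.853321055852934, 24.79640344112394, 28.59835938028905, 32.5257456, 37.87676923906976, 42.15321400245093, 46.794297771521705, 56.44267690099888, 61.60473904566305, 70.99822229568558, 77.60736232076566, 84.37513036736146, 92.9038746946938, 107.54475674330527, 117.91910226690293, 137.67481655050688, 158.02001455302846, 176.37334256204952, 195.20886164268876, 215.87011902349641, 240.41535423461914])

bandE = 0.7435616030790153  # bandgap energy (eV)


def exp_fit(E, a, b, c):
    return a + c * E * np.sqrt(E - bandE) * np.exp(-E * b)


# Same call as in the post: every residual counts equally in absolute terms
popt_plain, pcov_plain = curve_fit(
    exp_fit, photon_energy, s2c, maxfev=10000, p0=[0, 500 / 23, 1e9]
)

# Assumed alternative: sigma proportional to the data gives relative weighting.
# Starting from the plain result keeps the optimizer close to a known fit.
popt_rel, pcov_rel = curve_fit(
    exp_fit, photon_energy, s2c, sigma=s2c, p0=popt_plain, maxfev=10000
)


def rel_chi2(p):
    """Sum of squared relative residuals for parameters p (hypothetical helper)."""
    return np.sum(((s2c - exp_fit(photon_energy, *p)) / s2c) ** 2)


# One-sigma parameter uncertainties estimated from the covariance matrices
perr_plain = np.sqrt(np.diag(pcov_plain))
perr_rel = np.sqrt(np.diag(pcov_rel))

print("plain fit    a, b, c:", popt_plain, "+/-", perr_plain)
print("relative fit a, b, c:", popt_rel, "+/-", perr_rel)
print("relative chi2 plain vs weighted:", rel_chi2(popt_plain), rel_chi2(popt_rel))
```

If the relatively weighted fit tracks the low-intensity points better, the default weighting is a plausible suspect. It does not by itself guarantee that a comes out near 5e-4, and the uncertainty on a from the covariance matrix gives a hint of how well the data constrain it at all.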