
I am batch processing thousands of data sets. Sometimes the peak positions and magnitudes change drastically, and the program struggles to find these peaks with a single start point value. I have to divide my data into smaller batches to change the start point values, which is time consuming.

Is it possible to try various start point values and select the one with the best rsquare?

ft = fittype('y0 + a*exp(-((x-xa)/(wa))^2)', 'independent', 'x', 'dependent', 'y' );
opts = fitoptions( 'Method', 'NonlinearLeastSquares' );
opts.Display = 'Off';

opts.StartPoint = [10 10 10 0]; % this is a, wa, xa and y0 - from the equation

[fitresult, gof] = fit(xData, yData, ft, opts);

alpha = gof.rsquare; % extract goodness of fit

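if alpha < 0.98 % if rsquare (goodness of fit) is not good enough
    da  = 100:10:500; % offsets for a  - these numbers are not set in stone, can be any number
    dxa = 10:1:50;    % offsets for xa - same number of elements as da
    for k = 1:numel(da)
        opts.StartPoint = [10+da(k) 10 10+dxa(k) 0]; % tweak the start point values for the fit
        [trialresult, trialgof] = fit(xData, yData, ft, opts); % fit again
        if trialgof.rsquare > gof.rsquare % keep the fit with the best rsquare so far
            fitresult = trialresult;
            gof = trialgof;
        end
    end
end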

Then select the start point with the best rsquare and plot the results.

% plot
f = figure('Name', 'Gauss','Pointer','crosshair');
h = plot(fitresult, xData, yData, '-o'); 
Mosawi
  • If I'm not mistaken, you're describing the basin hopping approach. I don't think you need that for something as robust as a Gaussian, although you can look up Jean Jacquelin's work for a couple of good methods on getting really accurate initial parameters without iteration. – Mad Physicist Apr 21 '21 at 21:51
  • @MadPhysicist Thanks for your suggestion, I will read the suggested reference. I am batch processing thousands of data sets. Sometimes the positions and magnitudes of the peaks change drastically, and the program struggles to find these peaks with a single start point value. I have to divide my data into smaller batches to change the start point values, which is time consuming. – Mosawi Apr 21 '21 at 21:57
  • You need a good guessing approach. I've started working on a thing called scikit-guess, which you may find useful as a reference. – Mad Physicist Apr 21 '21 at 21:57

1 Answer


If guessing the start values is difficult, I suggest using a different method that is not iterative and does not need guessed values of the parameters to start the numerical calculation.

Since I have no representative data for your problem, I cannot check whether the method proposed below is suitable in your case. This depends on the scatter of the data and on the distribution of the points.

Try it and see. If the result is not correct, please let me know.

A numerical example with highly scattered data is shown below. With this example you can check if the method is correctly implemented.

[Figures: numerical example with highly scattered data]

NOTE: This method can be used to obtain approximate values of the parameters, which can then be used as "guessed" values in the usual non-linear regression software.

For information: The method is a linear regression with respect to an integral equation of which the Gaussian function is a solution:

[Figure: integral equation]

For the general principle, see: https://fr.scribd.com/doc/14674814/Regressions-et-equations-integrales
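As a rough illustration of the principle, here is a minimal MATLAB sketch for the model in the question, y = y0 + a*exp(-((x-xa)/wa)^2). It is not taken from the figures above, whose exact formulas are not reproduced here. It is derived directly from the differential equation that this model satisfies, so the formulation may differ from the original one. The synthetic data and all variable names (S, T, Q, L, c, startPoint, ...) are assumptions made for the example. It uses only base MATLAB functions (cumtrapz and the backslash operator).

```matlab
% Synthetic, noise-free test data (assumed values, for illustration only)
x = linspace(0, 100, 2001)';
y = 1 + 5*exp(-((x - 40)/8).^2);   % y0 = 1, a = 5, xa = 40, wa = 8

% The model satisfies  y' = -(2/wa^2)*(x - xa)*(y - y0).
% Integrating from x(1) to x gives an equation that is linear in A, B, C, D:
%   y - y(1) = A*T + B*S + C*Q + D*L
% with A = -2/wa^2, B = 2*xa/wa^2, C = 2*y0/wa^2, D = -2*xa*y0/wa^2
S = cumtrapz(x, y);        % running integral of y
T = cumtrapz(x, x.*y);     % running integral of t*y
Q = (x.^2 - x(1)^2)/2;     % integral of t
L = x - x(1);              % integral of 1

% First linear least-squares step: solve for [A; B; C; D]
c = [T, S, Q, L] \ (y - y(1));

% Recover width and position (if c(1) >= 0, e.g. for very noisy data,
% the square root is not real and the estimate should be rejected)
wa0 = sqrt(-2/c(1));
xa0 = -c(2)/c(1);

% Second linear least-squares step: with the shape fixed, y is linear in y0 and a
p = [ones(size(x)), exp(-((x - xa0)/wa0).^2)] \ y;
y00 = p(1);
a0  = p(2);

% Same order as the StartPoint in the question: a, wa, xa, y0
startPoint = [a0 wa0 xa0 y00];
```

If the estimates look reasonable for a given data set, startPoint could be assigned to opts.StartPoint before calling fit, so that each data set in the batch gets its own start values instead of one fixed guess.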

JJacquelin