RANSAC
The algorithm
The following steps are repeated until an inlier percentage threshold is reached or N sample combinations have been tested:
- Randomly select the smallest possible sample needed to build or fit a model.
- Classify the other data points as inliers or outliers with respect to that model.
- Accept or reject the model.
Inputs:
- Error tolerance for determining inliers and outliers
- Inlier percentage threshold
- Maximum number of sample combinations to test
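As a rough guide for choosing the maximum number of sample combinations, a commonly used estimate ties it to the expected inlier fraction. The sketch below is only an illustration: the names p, w and n and all values are assumptions, not part of the inputs listed above, and the estimate treats the samples as independent draws.

```matlab
% Estimate how many random samples are needed so that, with probability p,
% at least one sample consists of inliers only.
% All names and values are illustrative assumptions.
p = 0.99;   % desired probability of drawing at least one all-inlier sample
w = 0.5;    % assumed fraction of inliers in the data
n = 3;      % assumed minimal sample size (three model parameters)

% A single sample of n points is all inliers with probability w^n, so
% N samples all contain an outlier with probability (1 - w^n)^N.
% Requiring (1 - w^n)^N <= 1 - p and solving for N gives:
N = ceil(log(1 - p) / log(1 - w^n));
fprintf('Samples needed: %d\n', N);
```

A lower inlier fraction or a larger minimal sample raises N quickly, which is one reason to keep the sample as small as possible.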
Possible improvements
- Make sure that no combination is tested more than once
- If there is a better way to select combinations than purely at random, use it.
- Once a large number of inliers has been found, use that set of inliers as the basis for the further search.
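The first improvement could be sketched as follows: remember every index combination that has already been drawn and skip repeats. This is a minimal sketch, not a definitive implementation; numPoints, sampleSize, maxSamples and the key format are all assumptions made for illustration.

```matlab
% Draw random index combinations without testing any combination twice.
% All names and values are illustrative assumptions.
numPoints = 10;     % assumed number of data points
sampleSize = 3;     % assumed minimal sample size
maxSamples = 50;    % assumed maximum number of combinations to test

tested = containers.Map('KeyType', 'char', 'ValueType', 'any');
combos = zeros(0, sampleSize);
totalCombos = nchoosek(numPoints, sampleSize);   % number of distinct combinations

% Stop when maxSamples combinations are collected or all have been used.
while size(combos, 1) < min(maxSamples, totalCombos)
    idx = sort(randperm(numPoints, sampleSize)); % sorted, since order does not matter
    key = sprintf('%d,', idx);                   % text key such as '2,5,9,'
    if ~isKey(tested, key)
        tested(key) = true;                      % remember this combination
        combos(end + 1, :) = idx;                % the model would be fitted here
    end
end
```

For small data sets it may be simpler to list all combinations once with nchoosek and shuffle the rows; the map-based approach avoids building that full list when it would be too large.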
Source:
Fischler and Bolles - Random Sample Consensus: A Paradigm for Model Fitting with Applications to Image Analysis and Automated Cartography
Your application
Your model is a sine defined as f(x) = amplitude * sin(period * x) + bias. Fitting this model will not be easy, because it depends on three parameters. I think this risks long run times and possible overfitting. A possible solution might be to run the algorithm multiple times for different periods while keeping the bias and amplitude fixed.
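The idea of trying different periods with a fixed amplitude and bias could look roughly like this. It is a minimal sketch on made-up data: the amplitude, bias, candidate range and outlier pattern are all assumptions for illustration. Note that in the model above, period multiplies x, so it acts as an angular frequency.

```matlab
% Scan candidate periods with amplitude and bias held fixed, and keep the
% period whose model collects the most inliers.
% Data and parameter values are made up for illustration.
amplitude = 2.0;            % assumed known amplitude
bias = 0.5;                 % assumed known bias
errorThreshold = 0.05;      % error tolerance for counting a point as an inlier

x = linspace(0, 10, 200)';
y = amplitude * sin(1.5 * x) + bias;   % synthetic data, true period value 1.5
y(1:10:end) = y(1:10:end) + 3;         % turn every tenth point into an outlier

candidatePeriods = 0.5:0.1:3.0;        % assumed search range
inlierCounts = zeros(size(candidatePeriods));
for k = 1:numel(candidatePeriods)
    model = amplitude * sin(candidatePeriods(k) * x) + bias;
    inlierCounts(k) = sum(abs(y - model) < errorThreshold);
end

% Keep the candidate with the largest number of inliers.
[bestCount, bestIdx] = max(inlierCounts);
bestPeriod = candidatePeriods(bestIdx);
fprintf('Best period: %.2f with %d inliers\n', bestPeriod, bestCount);
```

With real data the candidate grid has to be fine enough that one candidate falls within the error tolerance of the true value, otherwise no candidate collects a clear majority of inliers.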
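% Outline of the RANSAC loop. extractMinimumSamples, fitSine and
% determineInliers are helper functions that still have to be written.
iterationThreshold = 10000;   % maximum number of sample combinations to test
iterationCount = 0;
errorThreshold = 0.05;        % error tolerance for classifying inliers
inlierThreshold = 0.8 * size(points, 1);   % assumed value: required number of inliers
inliers = zeros(0, size(points, 2));       % no inliers found yet
% Repeat until enough inliers are found or the iteration limit is reached.
while size(inliers, 1) < inlierThreshold && iterationCount < iterationThreshold
    samples = extractMinimumSamples(points);
    [sineX, sineY] = fitSine(samples);
    inliers = determineInliers(points, sineX, sineY, errorThreshold);
    iterationCount = iterationCount + 1;
end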
See also the possible improvements listed above for modifications to this code.