
I have a noisy signal and a model function, for example:

x=linspace(0,20);
w=[2 6 -4 5];
y=w(1)*besselj(0,x)+w(2)*besselj(1,x)+w(3)*besselj(2,x)+w(4)*besselj(3,x);
y(randi(length(y),[1 10]))=10*rand(1,10)-5;
plot(x,y,'x')

[plot of the noisy signal]

I thought of using RANSAC to find the `w` of my model, as this method is robust to noise when fitting lines. However, this is not a linear problem, and I couldn't get a proper fit, probably because of the oscillatory nature of the function I'm trying to fit.

I saw MATLAB has a fitPolynomialRansac function, but even this fails for a simple `a + b*x + c*x^2 + d*x^3` case (on the interval [-1, 1]).

Any ideas how to tame RANSAC, or for a different noise-robust approach?

bla
  • Why not use `lsqnonlin`? – rinkert Oct 13 '20 at 14:05
  • ...e.g. with a [`soft_l1` cost function](https://scipy-cookbook.readthedocs.io/items/robust_regression.html) – mikuszefski Oct 14 '20 at 10:44
  • @mikuszefski that sounds interesting indeed, I wonder if there's a built-in matlab capability for that... – bla Oct 14 '20 at 16:53
  • Well, I think that is not necessary. I do not use Matlab, but as far as I understand it, looking at the docs of [lsqnonlin](https://de.mathworks.com/help/optim/ug/lsqnonlin.html) it makes a least-squares minimization of a cost function that you provide. Typically you provide `f(x_i, params) = theory(x_i, params) - y_i`, which will, by squaring, result in the standard least squares. If instead you let it minimize `f(x_i, params) = sqrt(2 * (sqrt(1 + r^2) - 1))`, where `r = theory(x_i, params) - y_i`, you get exactly what is described in the link from above. Let me know if I am right/wrong. – mikuszefski Oct 15 '20 at 10:18
  • added a bounty, to give you incentive to demonstrate your solution. – bla Oct 29 '20 at 00:29
  • Why do you say this is not a linear problem? It is linear in `w`. Good old linear least squares `w_est = [besselj(0,x); besselj(1,x); besselj(2,x); besselj(3,x)].'\y.'` seems to be too sensitive to noise, though – Luis Mendo Oct 31 '20 at 00:46
  • I meant linear for RANSAC, as RANSAC is usually used for linear (line-fitting) problems. RANSAC is pretty robust to noise; that's why I ask about it in this context... – bla Oct 31 '20 at 05:40

3 Answers


This is just implementing @mikuszefski's comment to use a soft L1 loss function. It does appear to be more resistant to noise:

x = linspace(0,20);

% model function
yFun=@(w) w(1)*besselj(0,x)+w(2)*besselj(1,x)+w(3)*besselj(2,x)+w(4)*besselj(3,x);

% generate training data
N = 20; % number of noisy elements
w=[2 6 -4 5]; % true parameter values
y = yFun(w); % evaluate true model function
y(randi(length(y),[1 N])) = 10*rand(1,N)-5; % replace N random samples with outliers

% standard loss function for least squares
d = @(w) yFun(w)-y;

v1 = lsqnonlin(d,[1 1 1 1]); % normal least squares
v2 = lsqnonlin(@(w) sqrt(2*(sqrt(1+d(w).^2)-1)),[1 1 1 1]) % soft L1 loss
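For reference, the `soft_l1` recipe linked in the comments is built into scipy's `least_squares`, so the same robust fit needs no hand-coded residual transform there. A minimal Python sketch of that counterpart (assuming scipy; this is an editor's illustration, not part of the MATLAB answer):

```python
import numpy as np
from scipy.optimize import least_squares
from scipy.special import jv  # Bessel function of the first kind (besselj in MATLAB)

x = np.linspace(0, 20, 100)
w_true = np.array([2.0, 6.0, -4.0, 5.0])

def model(w, x):
    # same model as above: sum over k of w[k] * besselj(k, x)
    return sum(wk * jv(k, x) for k, wk in enumerate(w))

# training data with outliers, mirroring the MATLAB code above
rng = np.random.default_rng(0)
y = model(w_true, x)
y[rng.integers(0, len(y), 20)] = 10 * rng.random(20) - 5

# loss='soft_l1' applies rho(z) = 2*(sqrt(1+z) - 1) to the squared residuals,
# the same cost that is hand-coded for lsqnonlin above
fit = least_squares(lambda w: model(w, x) - y, x0=np.ones(4), loss='soft_l1')
print(fit.x)  # close to w_true despite the outliers
```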
David
  • thank you for the answer. This solution gives a somewhat OK fit, but it is not robust. I have figured out that replacing the cost (or loss) function with `sqrt(2*(sqrt(1+abs(d(w)))-1))` does a lot better. I think it is because it is really L1 and not a soft L1... so you were close, and helpful enough. I didn't test this with many types of noise, and I hope to see other answers that use a RANSAC approach. – bla Oct 29 '20 at 03:58
  • (forgive my typos :) ) – bla Oct 29 '20 at 04:27

I have modified the cost function to test a hard(er) L1 assumption, and got a much more robust fit than David's answer (I was testing with a higher number of noisy elements; see my addition, v3):

x = linspace(0,20);

% model function
yFun=@(w) w(1)*besselj(0,x)+w(2)*besselj(1,x)+w(3)*besselj(2,x)+w(4)*besselj(3,x);

% generate training data
N = 50; % number of noisy elements
w=[2 6 -4 5]; % true parameter values
y = yFun(w); % evaluate true model function
y(randi(length(y),[1 N])) = 10*rand(1,N)-5; % replace N random samples with outliers

% standard loss function for least squares
d = @(w) yFun(w)-y;

v1 = lsqnonlin(d,[1 1 1 1]); % normal least squares
v2 = lsqnonlin(@(w) sqrt(2*(sqrt(1+d(w).^2)-1)),[1 1 1 1]) % soft L1 loss
v3 = lsqnonlin(@(w) sqrt(2*(sqrt(1+abs(d(w)))-1)) ,[1 1 1 1]) %  the new cost function


plot(x,y,'x',x,yFun(v1),x,yFun(v2),x,yFun(v3))
legend('data','v1','v2','v3')

[plot comparing the data with the v1, v2, and v3 fits]

bla
  • Well, just to clarify, that's not really a hard L1. `sqrt(1 + x ) = 1 + x/2 + O(x^2)`. So yes, for small `x` this is `abs( x )`. For larger `x` it transitions into `sqrt( abs(x) )`, i.e. decreasing the weight of outliers even further. A pure L1 would just be ` sqrt( abs( x ) )` – mikuszefski Oct 29 '20 at 06:39
  • yes, i should have said "harder" or "less soft"... so how would you name `sqrt(2*(sqrt(1+abs(d(w)))-1))` ? – bla Oct 29 '20 at 07:20
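mikuszefski's characterization is easy to check numerically: since `lsqnonlin` squares the residual function, the effective per-point loss here is `2*(sqrt(1+|r|)-1)`, which behaves like `|r|` for small residuals and like `2*sqrt(|r|)` for large ones. A quick Python check (an editor's illustration, not part of the thread):

```python
import numpy as np

def effective_loss(r):
    # what lsqnonlin effectively minimizes after squaring sqrt(2*(sqrt(1+abs(r))-1))
    return 2 * (np.sqrt(1 + np.abs(r)) - 1)

# small residuals: loss(r)/|r| -> 1, i.e. L1 behaviour
print(effective_loss(1e-3) / 1e-3)         # ~1.0

# large residuals: loss(r)/sqrt(|r|) -> 2, i.e. even flatter than L1 on outliers
print(effective_loss(1e6) / np.sqrt(1e6))  # ~2.0
```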

The `ransac` implementation in MATLAB seems to work natively only with predefined fitting functions.

However, you can create a workaround by wrapping your fitting function in a function handle. `ransac` is part of the Computer Vision Toolbox or the Automated Driving System Toolbox.

x = linspace(0,20).';
w = [2 6 -4 5];
y = w(1)*besselj(0,x)+w(2)*besselj(1,x)+w(3)*besselj(2,x)+w(4)*besselj(3,x);
y(randi(length(y),[1 10])) = 10*rand(1,10)-5;
plot(x,y,'x')



sampleSize = 5; % number of points to sample per trial
maxDistance = 2; % max allowable distance for inliers

% creating the function you want to obtain
bFnc = @(x,w) w(1)*besselj(0,x)+w(2)*besselj(1,x)+w(3)*besselj(2,x)+w(4)*besselj(3,x);

% wrapping this function into a cost function with two inputs (the points
% and the "model")
cstFmin = @(xy,w) sum((xy(:,2) - bFnc(xy(:,1),w)).^2);

% creating a function handle that fits the function + returns a single
% object as a model
fitFnc = @(xy) {fminsearch(@(w)cstFmin(xy,w),ones(4,1))}; % returns the "model" as a cell

% build a function that determines a distance measure
evlFnc = @(model,xy)(( xy(:,2) - bFnc(xy(:,1),model{1}) ).^2);

% call RANSAC
[modelRANSAC, inlierIdx] = ransac([x,y],fitFnc,evlFnc, sampleSize,maxDistance);

% plot results
xIN = x(inlierIdx);
yIN = bFnc(xIN,modelRANSAC);

hold on
plot(xIN,yIN,'r-')
hold off
title(num2str(sampleSize,'sampleSize = %d'))

Note that the `fminsearch` algorithm always starts at `ones(4,1)`. You could also integrate a different optimization algorithm here.

[plot of the results]
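Since the model is linear in `w` (as Luis Mendo pointed out in the comments), each RANSAC trial can even use a plain least-squares solve instead of `fminsearch`. A self-contained Python sketch of the same sample-fit-count loop (an editor's illustration; function names are made up, not from any toolbox):

```python
import numpy as np
from scipy.special import jv  # besselj in MATLAB

def bessel_design(x, order=4):
    # columns besselj(0,x)..besselj(order-1,x); the model is y = A @ w
    return np.column_stack([jv(k, x) for k in range(order)])

def ransac_bessel(x, y, sample_size=5, max_distance=2.0, n_trials=500, seed=1):
    rng = np.random.default_rng(seed)
    A = bessel_design(x)
    best_w, best_count = np.zeros(A.shape[1]), -1
    for _ in range(n_trials):
        # fit a candidate model to a random minimal sample
        idx = rng.choice(len(x), size=sample_size, replace=False)
        w, *_ = np.linalg.lstsq(A[idx], y[idx], rcond=None)
        # count how many points the candidate explains within the band
        count = int((np.abs(A @ w - y) < max_distance).sum())
        if count > best_count:
            best_w, best_count = w, count
    # refit on all inliers of the best candidate
    inliers = np.abs(A @ best_w - y) < max_distance
    w, *_ = np.linalg.lstsq(A[inliers], y[inliers], rcond=None)
    return w, inliers

# data as in the question
x = np.linspace(0, 20, 100)
y = bessel_design(x) @ np.array([2.0, 6.0, -4.0, 5.0])
rng = np.random.default_rng(0)
y[rng.integers(0, len(y), 10)] = 10 * rng.random(10) - 5

w_est, inliers = ransac_bessel(x, y)
print(w_est)  # close to [2, 6, -4, 5]
```

Because the clean samples lie exactly on the curve here, any all-clean minimal sample recovers the model, and the final refit on the inliers sharpens it further.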

max
  • I think `ransac` is in the Computer Vision Toolbox. I tried running your code but it didn't work, and I couldn't fix it (I didn't have much time to try). Good to know that there is a `ransac` somewhere. – David Nov 05 '20 at 22:39
    @David I tested the code somewhere else and fixed minor bugs. Sorry for the trouble. It now works (see the graph ;) ). You can play around with the optimization function (`fminsearch` for now) and especially with the number of points from which `ransac` calibrates a function => `sampleSize` – max Nov 06 '20 at 08:15