
I have a set of measured radii (t+epsilon+error) at equally spaced angles. The model is a circle of radius (R) with its center at (r, Alpha), with small added noise plus some random error values that are much bigger than the noise.

The problem is to find the center of the circle model (r, Alpha) and the radius of the circle (R). The fit should not be too sensitive to the random errors (in the data below, at points 7 and 14).

Some radii could be missing, so a simple mean would not work here.

I tried least-squares optimization, but it reacts strongly to the errors.

Is there a way to optimize least deltas, rather than least squares of deltas, in Python?

Model:
n=36
R=100
r=10
Alpha=2*Pi/6

Data points:
[95.85, 92.66, 94.14, 90.56, 88.08, 87.63, 88.12, 152.92, 90.75, 90.73, 93.93, 92.66, 92.67, 97.24, 65.40, 97.67, 103.66, 104.43, 105.25, 106.17, 105.01, 108.52, 109.33, 108.17, 107.10, 106.93, 111.25, 109.99, 107.23, 107.18, 108.30, 101.81, 99.47, 97.97, 96.05, 95.29]

Oscar

2 Answers


It seems like your main problem here is going to be removing outliers. There are a couple of ways to do this, but for your application your best bet is probably to remove points based on their distance from the median (since the median is much less sensitive to outliers than the mean).

If you're using numpy, that would look like this:

import numpy as np

def remove_outliers(data_points, margin=1.5):
    data_points = np.asarray(data_points)  # accept plain lists as well as arrays
    nd = np.abs(data_points - np.median(data_points))
    s = nd / np.median(nd)
    return data_points[s < margin]  # keep points whose deviation ratio is small

After which you should run least squares.
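The least-squares step itself could be sketched as follows. The cosine formula below is a small-offset approximation of the question's model (valid when r is much smaller than R, as with the question's R=100, r=10), and the function name and starting guess are my own, not from the answer:

```python
import numpy as np
from scipy.optimize import least_squares

def fit_circle(radii, angles):
    # Small-offset approximation: the measured radius at angle theta
    # is roughly R + r*cos(theta - alpha) when r << R.
    def residuals(params):
        R, r, alpha = params
        return radii - (R + r * np.cos(angles - alpha))
    guess = [np.median(radii), 1.0, 0.5]
    return least_squares(residuals, guess).x

# synthetic, noise-free data built from the question's parameters
angles = np.linspace(0, 2 * np.pi, 36, endpoint=False)
radii = 100 + 10 * np.cos(angles - 2 * np.pi / 6)
R, r, alpha = fit_circle(radii, angles)
```

On clean data this recovers the model parameters; on real data you would pass the output of `remove_outliers` instead.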

If you're not using numpy you can do something similar with native python lists:

def median(points):
    return sorted(points)[len(points) // 2]  # // keeps the index an int on Python 3

def remove_outliers(data_points, margin=1.5):
    m = median(data_points)
    centered_points = [abs(point - m) for point in data_points]
    centered_median = median(centered_points)
    ratios = [datum / centered_median for datum in centered_points]
    # keep points below the margin (not above it, which would discard the inliers)
    return [point for i, point in enumerate(data_points) if ratios[i] < margin]

If you just want outliers to count less heavily, you can calculate the mean of your dataset, which is simply the linear equivalent of the least-squares optimization.

If you're looking for something a little better, I might suggest running your data through some kind of low-pass filter, but I don't think that's really needed here.

A low-pass filter would probably be the best option, which you can implement as follows. (Note: alpha here is a smoothing factor between 0 and 1, not the question's Alpha; you will have to tune it to get your desired output.)

def low_pass(data, alpha):
    # exponentially weighted moving average: smaller alpha = stronger smoothing
    new_data = [data[0]]
    for i in range(1, len(data)):
        new_data.append(alpha * data[i] + (1 - alpha) * new_data[i - 1])
    return new_data
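A quick usage example (the filter is repeated here only so the snippet runs on its own); a single spike is damped rather than removed:

```python
def low_pass(data, alpha):
    # same filter as above, repeated so this example is self-contained
    new_data = [data[0]]
    for i in range(1, len(data)):
        new_data.append(alpha * data[i] + (1 - alpha) * new_data[i - 1])
    return new_data

spike = [100.0] * 5 + [150.0] + [100.0] * 5   # one outlier in flat data
smoothed = low_pass(spike, 0.3)
# the 50-unit spike is cut to 0.3 * 50 = 15 units at its peak (115.0),
# and its remnant decays geometrically over the following samples
```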

At which point your least squares optimization should work fine.

Slater Victoroff
  • Thanks for the reply. The problem is that the size of the outliers is not defined, so I'm looking for code that works like least-squares optimization but without the squares, to minimize the outliers' contribution to the overall result. – Oscar Feb 16 '14 at 08:06
  • @Oscar The above code does not assume any size for the outliers, just a deviation from the standard. – Slater Victoroff Feb 16 '14 at 08:08
  • The first code gives "only integer arrays with one element can be converted to an index". The second removes 83% of the data points. Is there a way to optimize least deltas, not the squares? – Oscar Feb 16 '14 at 08:24
  • Updated the problem. Some radii could be missing, in which case the mean would not work. I was thinking about a Fourier transform, treating the radii as a sine wave, but that is difficult at my novice level. – Oscar Feb 16 '14 at 08:40
  • @Oscar You can change the exact cutoff to fit your problem, but I can post an example of the low-pass filter approach. A Fourier transform would not make sense here. – Slater Victoroff Feb 16 '14 at 08:51

Replying to your final question

Is there a way to optimize least deltas, rather than least squares of deltas, in Python?

Yes: pick an optimization method (for example, the downhill simplex implemented in scipy.optimize.fmin) and use the sum of absolute deviations as the merit function. Your dataset is small, so I expect any general-purpose optimization method to converge quickly. (For non-linear least-squares fitting it is also possible to use a general-purpose optimization algorithm, but it is more common to use the Levenberg-Marquardt algorithm, which minimizes sums of squares.)
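A sketch of that approach on the question's data. The cosine model (a small-offset approximation, assuming r much smaller than R) and the starting guess are my assumptions, not part of the answer:

```python
import numpy as np
from scipy.optimize import fmin

radii = np.array([95.85, 92.66, 94.14, 90.56, 88.08, 87.63, 88.12,
                  152.92, 90.75, 90.73, 93.93, 92.66, 92.67, 97.24,
                  65.40, 97.67, 103.66, 104.43, 105.25, 106.17, 105.01,
                  108.52, 109.33, 108.17, 107.10, 106.93, 111.25, 109.99,
                  107.23, 107.18, 108.30, 101.81, 99.47, 97.97, 96.05,
                  95.29])
angles = np.linspace(0, 2 * np.pi, len(radii), endpoint=False)

def sum_abs_dev(params):
    R, r, alpha = params
    model = R + r * np.cos(angles - alpha)   # small-offset approximation
    return np.sum(np.abs(radii - model))     # L1 merit function

x0 = [np.median(radii), 1.0, 1.0]            # rough starting guess
R, r, alpha = fmin(sum_abs_dev, x0, disp=False)
```

Because the merit function is a sum of absolute values rather than squares, the two outliers (152.92 and 65.40) pull the fit far less than they would in a least-squares fit.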

If you are interested in when minimizing absolute deviations instead of squares has a theoretical justification, see Numerical Recipes, chapter "Robust Estimation".

From the practical side, the sum of absolute deviations may not have a unique minimum. In the trivial case of two points, say (0,5) and (1,9), and a constant function y=a, any value of a between 5 and 9 gives the same sum (4). There is no such problem when the deviations are squared.
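This is easy to check numerically; a small sketch with the two points above:

```python
def sum_abs_dev(a, ys):
    # absolute-deviation cost of the constant fit y = a
    return sum(abs(y - a) for y in ys)

ys = [5, 9]
# every a between 5 and 9 attains the same minimum cost of 4
costs = [sum_abs_dev(a, ys) for a in (5.0, 6.0, 7.5, 9.0)]
# -> [4.0, 4.0, 4.0, 4.0]

def sum_sq_dev(a, ys):
    # squared-deviation cost: strictly convex, so its minimizer is unique
    return sum((y - a) ** 2 for y in ys)
# the squared cost is minimized only at the mean, a = 7
```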

If minimizing absolute deviations does not work, you may consider a heuristic procedure to identify and remove outliers, such as RANSAC or ROUT.

marcin