Fitting an ellipse through orbital data

Question

I've generated a bunch of data for the (x,y,z) coordinates of a planet as it orbits around the Sun. Now I want to fit an ellipse through this data.

What I tried to do:

I created a dummy ellipse based on five parameters: the semi-major axis and eccentricity, which define the size and shape, and the three Euler angles, which orient the ellipse. Since my data is not always centered at the origin, I also need to translate the ellipse, which requires three additional variables (dx, dy, dz). Once I initialise this function with these eight variables, I get back N points that lie on this ellipse (N = the number of data points I am fitting the ellipse through). I calculate the deviation of these dummy points from the actual data and then minimise this deviation with a numerical minimisation method to find the best-fitting values for these eight variables.

My problem is with the very last part: minimising the deviation and finding the variables' values.

To minimise the deviation I use scipy.optimize.minimize to approximate the best-fitting variables, but it doesn't do a good enough job:

Here is an image of what one of my best fits looks like and that's with a very generously accurate initial guess. (blue = data, red = fit)

Here is the entire code. (No data required, it generates its own phony data)

In short, I use this scipy function:

initial_guess = [0.3,0.2,0.1,0.7,3,0.0,-0.1,0.0]
bnds = ((0.2, 0.5), (0.1, 0.3), (0, 2*np.pi), (0, 2*np.pi), (0, 2*np.pi), (-0.5,0.5), (-0.5,0.5), (-0.3,0.3)) #reasonable bounds for the variables
result = optimize.minimize(deviation, initial_guess, args=(data,), method='L-BFGS-B', bounds=bnds, tol=1e-8) # perform minimisation
semi_major,eccentricity,inclination,periapsis,longitude,dx,dy,dz = result["x"]

To minimize this error (or deviation) function:

def deviation(variables, data):
    """
    This function calculates the cumulative separation between the ellipse fit points and data points and returns it
    """
    num_pts = len(data[:,0])
    semi_major,eccentricity,inclination,periapsis,longitude,dx,dy,dz = variables
    dummy_ellipse = generate_ellipse(num_pts,semi_major,eccentricity,inclination,periapsis,longitude,dz,dy,dz)
    deviations = np.zeros(len(data[:,0]))
    pair_deviations = np.zeros(len(data[:,0]))
    # Calculate separation between each pair of points
    for j in range(len(data[:,0])):
        for i in range(len(data[:,0])):
            pair_deviations[i] = np.sqrt((data[j,0]-dummy_ellipse[i,0])**2 + (data[j,1]-dummy_ellipse[i,1])**2 + (data[j,2]-dummy_ellipse[i,2])**2)
        deviations[j] = min(pair_deviations) # only pick the closest point to the data point j.
    total_deviation = sum(deviations)
    return total_deviation

(My code may be a bit messy & inefficient, I'm new to this)

I may be making some logical error in my code, but I think it comes down to the scipy.optimize.minimize function. I don't know enough about how it works or what to expect of it. I was also advised to try Markov chain Monte Carlo when dealing with this many variables. I did take a look at emcee, but it's a little above my head right now.

Disclaimer: I've not read through all your code. How bad **really** is that fit? Just because it looks a bit off doesn't necessarily convey that it's a really bad fit. When I ran your code, all fitted parameters were correct to at most +-0.05 (inclination), the rest were even "preciser" with about +-0.02. Would creating sort of a pipeline where you'd only fit a couple of parameters at a time help increase precision? (i.e. make assumption that all your angles are zero and fit only `e` and `a`, then take that ellipse and fit angles to it based on original data, maybe with a different method etc) — ljetibo, Mar 10 '15 at 13:03
Just an observation: since your data presumably lie in a plane, it might be an idea to define your ellipse in this plane before you fit it rather than attempting to fit the Euler angles. — xnx, Mar 10 '15 at 13:31
The link in the post is broken, so the code in the questions/answers is insufficient to recreate an example ... — , Dec 15 '19 at 10:12

pv. · Accepted Answer · 2015-03-10T20:01:35.417

First, you have a typo in your objective function that prevents optimization of one of the variables:

dummy_ellipse = generate_ellipse(...,dz,dy,dz)

should be

dummy_ellipse = generate_ellipse(...,dx,dy,dz)

Also, dropping the sqrt and minimizing the sum of squared Euclidean distances makes the problem numerically somewhat easier for the optimizer.
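
A minimal sketch of the squared-distance idea, assuming the data and the generated ellipse points are both (N, 3) NumPy arrays. The name sum_sq_nearest and its arguments are hypothetical and not from the asker's code:

```python
import numpy as np

def sum_sq_nearest(data, model_pts):
    """Sum over data points of the squared distance to the nearest model point.

    Hypothetical helper: data has shape (N, 3), model_pts has shape (M, 3).
    """
    # diff[j, i, :] = data[j] - model_pts[i], shape (N, M, 3) via broadcasting
    diff = data[:, None, :] - model_pts[None, :, :]
    # Squared Euclidean distances, shape (N, M); no sqrt is taken
    sq_dist = (diff ** 2).sum(axis=2)
    # For each data point keep the closest model point, then add everything up
    return sq_dist.min(axis=1).sum()
```

This replaces the double loop with broadcasting, but it still contains a min over model points, so it is still not differentiable everywhere.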

Your objective function is also not differentiable everywhere because of the min(). The BFGS solver assumes a smooth objective, so its performance will be suboptimal.

Also, approaching the problem from an analytic-geometry perspective may help: an ellipse in 3D is defined as the solution of two equations

f1(x,y,z,p) = 0
f2(x,y,z,p) = 0

where p are the parameters of the ellipse. Now, to fit the parameters to a data set, you could try to minimize

F(p) = sum_{j=1}^N [f1(x_j,y_j,z_j,p)**2 + f2(x_j,y_j,z_j,p)**2]

where the sum goes over data points.
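
One possible concrete choice of f1 and f2, given here only as an illustration (this parameterization is an assumption, not something specified above): take p to consist of two foci F_1 and F_2, the semi-major axis a, and a unit normal n of the orbital plane. The ellipse is then the intersection of the plane through F_1 with normal n and the spheroid of points whose distances to the two foci sum to 2a, with the extra condition that F_2 also lies in that plane:

```latex
\begin{aligned}
f_1(\mathbf{r}, p) &= \hat{\mathbf{n}} \cdot (\mathbf{r} - \mathbf{F}_1), \\
f_2(\mathbf{r}, p) &= \lVert \mathbf{r} - \mathbf{F}_1 \rVert
                     + \lVert \mathbf{r} - \mathbf{F}_2 \rVert - 2a, \\
\hat{\mathbf{n}} \cdot (\mathbf{F}_2 - \mathbf{F}_1) &= 0,
\qquad \mathbf{r} = (x, y, z).
\end{aligned}
```

Both functions are smooth in p for data points away from the foci, so no min() over generated points is needed.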

Even better, in this problem formulation you could use optimize.leastsq, which may be more efficient in least squares problems.
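
A rough sketch of how this could look with optimize.leastsq. Everything specific here is an assumption made for illustration: the parameter vector p = (focus 1, focus 2, a, theta, phi), the choice of f1 as distance from the plane through focus 1 and f2 as the sum of focal distances minus 2a, the extra "tie" residual that keeps focus 2 in the plane, the helper names, and the noise-free synthetic data.

```python
import numpy as np
from scipy import optimize

def unit_normal(theta, phi):
    """Unit vector from spherical angles (hypothetical helper)."""
    return np.array([np.sin(theta) * np.cos(phi),
                     np.sin(theta) * np.sin(phi),
                     np.cos(theta)])

def residuals(p, pts):
    """Stacked residuals for p = (F1 xyz, F2 xyz, a, theta, phi) (assumed layout)."""
    focus1, focus2 = p[0:3], p[3:6]
    a, theta, phi = p[6:9]
    n = unit_normal(theta, phi)
    # f1: signed distance of each point from the plane through focus1 with normal n
    f1 = (pts - focus1) @ n
    # f2: sum of the distances to both foci minus 2a (zero on the spheroid)
    f2 = (np.linalg.norm(pts - focus1, axis=1)
          + np.linalg.norm(pts - focus2, axis=1) - 2.0 * a)
    # Extra residual (an addition of this sketch): keep focus2 in the same plane
    tie = np.dot(focus2 - focus1, n)
    return np.concatenate([f1, f2, [tie]])

# Synthetic, noise-free ellipse (all values below are made up for the demo)
a_true, e_true = 1.0, 0.5
theta_true, phi_true = 0.4, 1.0
center = np.array([0.1, -0.2, 0.05])
n_true = unit_normal(theta_true, phi_true)
# u and v are orthonormal vectors spanning the plane; u is the major-axis direction
u = np.array([np.cos(theta_true) * np.cos(phi_true),
              np.cos(theta_true) * np.sin(phi_true),
              -np.sin(theta_true)])
v = np.cross(n_true, u)
b_true = a_true * np.sqrt(1.0 - e_true ** 2)
t = np.linspace(0.0, 2.0 * np.pi, 60, endpoint=False)
pts = center + a_true * np.cos(t)[:, None] * u + b_true * np.sin(t)[:, None] * v

# True parameters, and an initial guess deliberately offset from them
p_true = np.concatenate([center - a_true * e_true * u,
                         center + a_true * e_true * u,
                         [a_true, theta_true, phi_true]])
p0 = p_true + 0.05 * np.array([1, -1, 1, -1, 1, -1, 1, 1, -1])

# leastsq minimizes the sum of squares of the residual vector
p_fit, ier = optimize.leastsq(residuals, p0, args=(pts,))
```

Note that optimize.leastsq takes no bounds argument, so unlike the L-BFGS-B call in the question the parameters are unconstrained here, and a reasonable initial guess is still needed.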

Of course there had to be a typo. -.- Very helpful! Thank you! — SubTachyon, Mar 11 '15 at 20:49
