0

I have an array of 50 samples of x,y,z-coordinates. These samples represent a linear movement of an arm. I want to do the following:

  1. Find the best fitting line in the 3d-coordinate system for 50 samples.
  2. Based on the new calculated line, check if a new point (sample 51) has a distance to the line below a certain threshold.

So far, I tried to use PCA to calculate the best fitting line, but I have some trouble to understand it. I have to do it by hand. First I generated an array of 50 samples.

import random
import numpy as np

x, y, z = np.mgrid[-30:-10:50j], np.mgrid[-50:-30:50j], np.mgrid[0:-10:50j]
points = np.concatenate((x[:, np.newaxis], 
                       y[:, np.newaxis], 
                       z[:, np.newaxis]), 
                      axis=1)
points += np.random.normal(size=points.shape) * 0.8
print(np.shape(points))

This resulted in the following picture. 3D-Plot of 50 samples

Afterwards I have to calculate the cov-matrix and the eigenvectors/eigenvalues for the PCA.

# Mean of every column. 
x, y, z = points[:,0], points[:,1], points[:,2]
mean = np.array((sum(x)/len(x), sum(y)/len(y), sum(z)/len(z)))

# Cov-Matrix
N, M = np.shape(points)
cov = np.zeros((M,M))

for i in range(M):
    for j in range(M):
        summe = 0
        for k in range(N):
            summe += (points[k][i] - mean[i]) * (points[k][j] - mean[j]) / (N-1)
        cov[i, j] = summe

# eigenvectors and eigenvalues. Safe first principle component to variable "eig"
eigval, eigvec = np.linalg.eig(cov)
eig = eigvec[:,np.argmax(eigval)]

The eigenvector corresponding to the highest eigenvalue is our first component. To compare the results, I used sklearn to test whether my values are right or not.

Compared to PCA my values for the direction of the vector are the same:

eig:     [-0.67634469 -0.66399367  0.31885775]
sklearn: [ 0.67634469  0.66399367 -0.31885775]

And now I do not have any clue, how I can use this direction vector to calculate a distance to a new point. Furthermore, if I try to plot it, the points and the cloud are off due to an offset.

# Plot points in 3D
def plot_line(arr, line):
    fig = plt.figure()
    ax = plt.axes(projection='3d')
    x, y, z = arr[:,0], arr[:,1], arr[:,2]
    ax.scatter3D(x, y, z)
    
    segments = np.arange(-20, 20)[:, np.newaxis] * line
    lineplot = ax.plot(*(segments).T, color="red")
    lineplot = ax.plot(*(line).T, 'ro')

3D-Coordinate System with points and off direction vector

How to I get the point that line passes through, when I have the slope? Is there a better/faster alternative to get the line or is PCA the best way to do it?

Thank your for your help!

  • So, your question is simply "line to point distance calculation way"? – fana May 17 '23 at 02:02
  • The line from 50 points will path through the mean position of 50 points. – fana May 17 '23 at 02:30
  • Thanks, i do not know why i struggled to find the offset. Is there any better way to find such a line in 3D? – Vorsten May 17 '23 at 08:52
  • It will depend on the meaning of "better". But when think about robustness (avoiding bad effect from outliers), for example M-estimator or RANSAC will be able to be better way? – fana May 17 '23 at 09:49

0 Answers0