I have a Support Vector Machine that splits my data in two using a decision hyperplane (for visualisation purposes this is a sample dataset with three dimensions), like this:
Now I want to perform a change of basis so that the hyperplane lies flat on the x/y plane and the distance from each sample point to the decision hyperplane is simply its z-coordinate.
The hyperplane of the SVM is given by its coefficient (a 3D vector) and its intercept (a scalar), using (as far as I understand it) the general form for planes, ax + by + cz = d, with a, b, c being the components of the coefficient and d being the intercept. Plotted as a 3D vector, the coefficient is orthogonal to the plane (in the image it's the cyan line).
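To make the plane form concrete, here is a minimal sketch that checks the two properties used above: the coefficient is orthogonal to the plane, and the signed distance of a point p to the plane is (coef . p - d) / |coef|. The numbers and the helper name `signed_distance` are made up for illustration. It assumes the convention ax + by + cz = d from the text; some libraries (scikit-learn, for instance) define the decision function as coef . p + intercept, in which case d would be the negated intercept.

```python
import numpy as np

def signed_distance(points, coef, d):
    """Signed distance of each row of `points` to the plane coef . p = d."""
    coef = np.asarray(coef, dtype=float)
    return (np.asarray(points, dtype=float) @ coef - d) / np.linalg.norm(coef)

coef = np.array([1.0, 2.0, 2.0])   # made-up normal vector (a, b, c)
d = 3.0                            # made-up right-hand side of a*x + b*y + c*z = d

# Two points that satisfy the plane equation.
on_plane = np.array([[3.0, 0.0, 0.0], [1.0, 1.0, 0.0]])
print(signed_distance(on_plane, coef, d))

# The difference of two points on the plane is orthogonal to coef.
print(np.dot(on_plane[0] - on_plane[1], coef))

# Moving 2 units along the unit normal gives a signed distance of about 2.
shifted = on_plane + 2 * coef / np.linalg.norm(coef)
print(signed_distance(shifted, coef, d))
```

This signed distance is the value the z-coordinate should take after the desired transformation.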
Now to the change of basis: if there were no intercept, I could take the coefficient as one vector of my new basis, pick any vector that lies in the plane as the second, and take the cross product of the two as the third. That gives three orthogonal vectors, which become the column vectors of the transformation matrix.
The z-function used in the code below comes from rearranging the general form of the plane, ax + by + cz = d <=> z = (d - ax - by) / c:
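import numpy as np

# z-coordinate of the plane a*x + b*y + c*z = d at (x, y); requires coef[2] != 0
z_func = lambda interc, coef, x, y: (interc - coef[0]*x - coef[1]*y) / coef[2]

def generate_trafo_matrices(coefficient, z_func):
    normalize = lambda vec: vec / np.linalg.norm(vec)
    uvec1 = normalize(coefficient)  # unit normal of the plane
    # direction inside the plane; intercept 0 so that it passes through the origin
    uvec2 = normalize(np.array([1, 0, z_func(0, coefficient, 1, 0)]))
    uvec3 = normalize(np.cross(uvec1, uvec2))
    # normal goes last so that the plane ends up on the xy-plane instead of the yz-plane
    back_trafo_matrix = np.array([uvec2, uvec3, uvec1]).T
    trafo_matrix = np.linalg.inv(back_trafo_matrix)
    return trafo_matrix, back_trafo_matrix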
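A quick sanity check for a matrix built this way, as a sketch with a made-up coefficient vector and a hypothetical helper `plane_basis` (it assumes the third component of the coefficient is nonzero). If the three basis vectors are orthonormal, the inverse is just the transpose, and for a plane through the origin the new z-coordinate equals the signed distance coef . p / |coef|.

```python
import numpy as np

def plane_basis(coef):
    """Orthonormal basis (as columns) whose third vector is the unit normal.

    Hypothetical helper for illustration; assumes coef[2] != 0.
    """
    n = coef / np.linalg.norm(coef)
    # Direction inside the plane through the origin: a*1 + b*0 + c*z = 0.
    u = np.array([1.0, 0.0, -coef[0] / coef[2]])
    u /= np.linalg.norm(u)
    v = np.cross(n, u)  # unit length already, because n and u are orthonormal
    return np.column_stack([u, v, n])

coef = np.array([1.0, 2.0, 2.0])  # made-up coefficient vector
back = plane_basis(coef)

# Columns are orthonormal, so the inverse equals the transpose.
print(np.allclose(back.T @ back, np.eye(3)))
print(np.allclose(np.linalg.inv(back), back.T))

# For a plane through the origin, the new z is the signed distance to it.
p = np.array([0.5, -1.0, 4.0])
print(np.isclose((back.T @ p)[2], coef @ p / np.linalg.norm(coef)))
```

Note that this only holds when the normal is normalized; with the raw coefficient as a basis vector, the new z-coordinate would be scaled by 1 / |coef|.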
This transformation matrix would then be applied to all points, like this:
def _transform(self, points, inverse=False):
    trafo_mat = self.inverse_trafo_mat if inverse else self.trafo_mat
    points = np.array([trafo_mat.dot(point) for point in points])
    return points
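As a side note, the per-point loop can be written as a single matrix product when the points are stored one per row. The sketch below uses a random stand-in matrix and random points (not the SVM data) just to show that both forms agree.

```python
import numpy as np

rng = np.random.default_rng(0)
trafo_mat = rng.normal(size=(3, 3))  # stand-in matrix, not the SVM one
points = rng.normal(size=(5, 3))     # one sample per row

# Loop form: apply the matrix to every point separately.
looped = np.array([trafo_mat.dot(point) for point in points])

# Vectorized form: row vectors are multiplied by the transposed matrix.
vectorized = points @ trafo_mat.T

print(np.allclose(looped, vectorized))
```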
If the intercept were zero, this would work perfectly and the plane would lie flat on the xy-plane. However, as soon as the intercept is nonzero, the plane is not flat anymore:
I understand why: this is no longer a pure change of basis, because the origin of the new coordinate system is not at (0,0,0) but somewhere else (the hyperplane can cross the coefficient vector at any point). However, none of my attempts to add the intercept to the transformation gave the correct result:
def _transform(self, points, inverse=False):
    trafo_mat = self.inverse_trafo_mat if inverse else self.trafo_mat
    intercept = self.intercept if inverse else -self.intercept
    ursprung_translate = trafo_mat.dot(np.array([0,0,0])+trafo_matrix[:,0]*intercept)
    points = np.array([point+trafo_matrix[:,0]*intercept for point in points])
    points = np.array([trafo_mat.dot(point) for point in points])
    points = np.array([point-ursprung_translate for point in points])
    return points
This attempt, for example, gives the wrong result. I am asking this on Stack Overflow and not on Mathematics Stack Exchange because I don't think I would be able to translate the respective math into code; I am glad I even got this far.
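For reference, a minimal sketch of one way the offset could be folded in, under stated assumptions: the plane follows the convention coef . p = d, the third component of the coefficient is nonzero, and points are stored one per row. The idea is to first shift the origin to a point on the plane (here the point closest to (0,0,0)) and only then rotate. The helper name `make_plane_transform` and the numbers are made up; this is not taken from the gist.

```python
import numpy as np

def make_plane_transform(coef, d):
    """Return (forward, backward) maps for the plane coef . p = d.

    Hypothetical helper; assumes coef[2] != 0 and the convention coef . p = d.
    """
    norm = np.linalg.norm(coef)
    n = coef / norm
    u = np.array([1.0, 0.0, -coef[0] / coef[2]])  # direction inside the plane
    u /= np.linalg.norm(u)
    v = np.cross(n, u)
    rot = np.vstack([u, v, n])   # rows are the new axes (orthonormal)
    origin = n * d / norm        # point on the plane closest to (0, 0, 0)

    # Forward: translate so the plane passes through the origin, then rotate.
    forward = lambda pts: (np.asarray(pts) - origin) @ rot.T
    # Backward: rotate back (inverse = transpose), then undo the translation.
    backward = lambda pts: np.asarray(pts) @ rot + origin
    return forward, backward

coef, d = np.array([1.0, 2.0, 2.0]), 3.0  # made-up plane
forward, backward = make_plane_transform(coef, d)

pts = np.array([[3.0, 0.0, 0.0], [1.0, 1.0, 0.0], [0.5, -1.0, 4.0]])
new = forward(pts)

print(new[:, 2])                        # signed distances to the plane
print(np.allclose(backward(new), pts))  # round trip recovers the points
```

The order matters: translating after the rotation would require expressing the offset in the new basis, which is an easy place to pick the wrong column or sign.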
I have created a GitHub gist with the code for the transformation and the plots at https://gist.github.com/cstenkamp/0fce4d662beb9e07f0878744c7214995. It can be launched on Binder via https://mybinder.org/v2/gist/jtpio/0fce4d662beb9e07f0878744c7214995/master?urlpath=lab%2Ftree%2Fchange_of_basis_with_translate.ipynb if somebody wants to play around with the code itself.
Any help is appreciated!