0

I want to make a 3D PCA plot with the first three principal components and to have a 95% confidence ellipsoid for each class (label). Here is my code:

% Input data
InputData = [7.72 6.73 3.33 0.12 0.06 -0.31 1;
             8.92 8.22 4.56 0.06 0.72 0.01 1;
             2.11 1.59 0.7 0.12 0.67 -0.02 1;
             79.42 92.88 75.03 64.98 67.8 55.93 2;
             100.82 112.5 94.7 73.16 74.22 64.15 2;
             69.43 83.4 67.46 80.78 81.36 69.38 2;
             25.25 23.86 17.95 0.12 0.44 0.14 3;
             39.97 36.9 30.06 0.13 0.82 0.35 3;
             31.99 29.48 23.68 -0.12 0.53 -0.1 3;
             129.93 126.54 113.51 1.09 0.17 0.38 4;
             129.91 127.4 114.6 1.32 1.05 1.19 4;
             110.38 108.7 96.3 0.41 1.25 0.87 4];

% Extract the features (first 6 columns) and labels (last column) from the data
X = InputData(:,1:6);
labels = InputData(:,7);

% Perform PCA on the data
[coeff,score,latent,~,explained] = pca(X);

% Get the first three principal components
PC1 = score(:,1);
PC2 = score(:,2);
PC3 = score(:,3);

% Calculate the means for each label
mean1 = mean(score(labels==1,1:3));
mean2 = mean(score(labels==2,1:3));
mean3 = mean(score(labels==3,1:3));
mean4 = mean(score(labels==4,1:3));

% Calculate the covariance matrices for each label
cov1 = cov(score(labels==1,1:3));
cov2 = cov(score(labels==2,1:3));
cov3 = cov(score(labels==3,1:3));
cov4 = cov(score(labels==4,1:3));

% Set the confidence level for the ellipsoids
confLevel = 0.95;

% Create a 3D scatter plot with different colors for each label
figure
scatter3(PC1(labels==1), PC2(labels==1), PC3(labels==1), 'r', 'filled')
hold on
scatter3(PC1(labels==2), PC2(labels==2), PC3(labels==2), 'g', 'filled')
scatter3(PC1(labels==3), PC2(labels==3), PC3(labels==3), 'b', 'filled')
scatter3(PC1(labels==4), PC2(labels==4), PC3(labels==4), 'm', 'filled')
title('3D Scatter Plot with 95% Confidence Ellipsoids')
xlabel('PC1')
ylabel('PC2')
zlabel('PC3')
legend('Label 1', 'Label 2', 'Label 3', 'Label 4')

% Create the ellipsoids for each label
drawEllipsoid(mean1, cov1, confLevel, 'r')
drawEllipsoid(mean2, cov2, confLevel, 'g')
drawEllipsoid(mean3, cov3, confLevel, 'b')
drawEllipsoid(mean4, cov4, confLevel, 'm')

% Function to draw ellipsoids
function drawEllipsoid(mean, covMatrix, confLevel, color)
    % Get the eigenvalues and eigenvectors of the covariance matrix
    [V,D] = eig(covMatrix);

    % Calculate the radii of the ellipsoid
    radii = sqrt(diag(D) * finv(confLevel, 3, size(mean,2)-1));

    % Generate points on a unit sphere
    [x,y,z] = sphere;

    % Transform the points using the eigenvectors and radii
    points = [x(:) y(:) z(:)] * diag(radii) * V';

    % Add the mean to the points
    points = points + mean;

    % Reshape the points into a matrix
    points = reshape(points, size(x,1), size(x,2), size(x,3));

    % Plot the ellipsoid
    h = surf(points(:,:,1), points(:,:,2), points(:,:,3));
    set(h, 'FaceColor', color, 'FaceAlpha', 0.1, 'EdgeAlpha', 0.1)
end

Matlab tells me there is an error in the drawEllipsoid function but I can't fix it or understand what is wrong. Error says:

Error using reshape Number of elements must not change. Use [] as one of the size inputs to automatically calculate the appropriate size for that dimension.

Error in test_3Dpcav2>drawEllipsoid (line 83) points = reshape(points, size(x,1), size(x,2), size(x,3));

Error in test_3Dpcav2 (line 60) drawEllipsoid(mean1, cov1, confLevel, 'r')

Can someone please explain and/or show how to fix it?

STEMQs
  • 75
  • 1
  • 10

1 Answers1

1

1.- fix for scatterplot error

the sizes of variables x y z and points did not fit into reshape as required.

Since you use surf, it makes sense to turn z into function of x and y, and then using reshape just for such z values, the 3rd column of variable points, leaving x and y as they are, using x and y directly into surf.

function drawEllipsoid(mn, covMatrix, confLevel, color,ax)
    % Get the eigenvalues and eigenvectors of the covariance matrix
    [V,D] = eig(covMatrix);

    % Calculate the radii of the ellipsoid
    radii = sqrt(diag(D) * finv(confLevel, 3, size(mn,2)-1));

    % Generate points on a unit sphere
    [x,y,z] = sphere;

    % Transform the points using the eigenvectors and radii
    pnts = [x(:) y(:) z(:)] * diag(radii) * V';

    % Add the mean to the points
    fxy = abs(pnts + mn);

    % Reshape the points into a matrix
    F1 = reshape(fxy(:,3), size(x,1), size(x,2));

    % Plot the ellipsoid
    
    h = surf(x,y,F1);
    set(h, 'FaceColor', color, 'FaceAlpha', 0.1, 'EdgeAlpha', 0.1)
end

enter image description here

This way drawEllipsoid takes off and doesn't return syntax errors, but it still doesn't fly where it is intended to.

Your function drawEllipsoids puts all ellipsoids right in the origin.

Zoom in on pile-up

enter image description here

2.- Scaling and shifting ellipsoids

The ellipsoids generated with the initial drawEllipsoid are all very small and piled up right on [0 0 0] because you do not scale and shift the outcome of sphere accordingly.

Command sphere always produces a radius 1 sphere centred on [0 0 0].

One has to scale such initial sphere into an ellipsoid, give it the right attitude (spatial orientation) and shift it to the right point, centred correctly so the ellipsoid comprises all or most of the a give set of points.

[-1 1] is very small in comparison to x y range [-100 100].

To this purpose I have defined R=20 and shifted each sphere with mn values.

But somewhere between the eigenvalues calculation [V,D] = eig(covMatrix); and radii = sqrt(diag(D) * finv(confLevel, 3, size(mn,2)-1)); complex values are produced, unsuitable to turn spheres into ellipsoids.

So just to get through I took abs so now you can see that the spheres are more or less scaled, centred, and it would be up to you to fix the complex values, as well as using a different radius for each cluster, as well as shaping the sphere into a truly ellipsoid with particular values in covMatrix.

enter image description here

The error you showed in the question is now solved and I have explained how to complete the clustering you are working on. It should be easy for you to take it from here.

John BG
  • 690
  • 4
  • 12
  • Thank you! I ended up modifying the `drawEllipsoid` function quite a bit based on your advice, but now I have 2D ellipses (seemingly on the right places), but I'm not sure why it is only projected on 2 axes, instead of 3. I ended up using `radii = sqrt(abs(diag(D)) * finv(confLevel, 3, size(mn,2)-1)); [x,y,z] = sphere(20); pnts = [x(:) y(:) z(:)] * V * diag(radii); pnts = bsxfun(@plus, pnts, mn); F1 = reshape(pnts(:,1), size(x,1), size(x,2)); F2 = reshape(pnts(:,2), size(x,1), size(x,2)); F3 = reshape(pnts(:,3), size(x,1), size(x,2));` & a mesh, instead of the `surf` function. Any thoughts? – STEMQs May 02 '23 at 17:27
  • you are right, I simplified to z=F(x,y) and when plotting all points ended up on a plane instead of 3D. – John BG May 03 '23 at 18:10
  • If you surf with all x y z you are going to place the points back to their respective 3D positions. Then you have to 1. reshape 2. orientate, and 3. translate such ellipsoids to the central point of each cluster. – John BG May 03 '23 at 18:11
  • The general equation for a 3D ellipsoid, any shape, any orientation, any position can be found for instance here https://en.wikipedia.org/wiki/Ellipsoid you need f1 f2 f3 for each cluster, to shape and aim the ellipsoid, and then x=x-f0(1);y=y-f0(2);z=z-f0(3); – John BG May 03 '23 at 18:14