
With my data stored in X, I run:

from sklearn.decomposition import PCA

pca = PCA(n_components=1)
X_pca = pca.fit_transform(X)  # fit_transform already fits, so a separate pca.fit(X) is not needed

Now X_pca has a single column (one component).

When I perform the inverse transformation, isn't it by definition supposed to return the original data, that is X, a 2-D array?

When I run

X_ori = pca.inverse_transform(X_pca)

I get the same dimensions but different numbers.

Also, if I plot both X and X_ori, they look different.

num3ri
haneulkim

2 Answers


When I perform inverse transformation by definition isn't it supposed to return to original data

No, you can only expect this if the number of components you specify is the same as the dimensionality of the input data. For any n_components less than this, you will get different numbers than the original dataset after applying the inverse PCA transformation: the following diagrams give an illustration in two dimensions.

[Figure: illustration in two dimensions]
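To check this numerically, here is a minimal sketch. The random data, the fixed seed, and the helper name reconstruct are my own illustrative choices, not part of the question. It round-trips the same 2-D data through PCA with one and with two components and compares the result to the input.

```python
import numpy as np
from sklearn.decomposition import PCA

# Illustrative data only: 10 random points in 2-D, with a fixed seed.
rng = np.random.default_rng(0)
X = rng.random((10, 2))

def reconstruct(X, n_components):
    """Hypothetical helper: project X with PCA, then map it back."""
    pca = PCA(n_components=n_components)
    return pca.inverse_transform(pca.fit_transform(X))

X_rec_1 = reconstruct(X, 1)  # one component kept: lossy round trip
X_rec_2 = reconstruct(X, 2)  # all components kept: lossless up to rounding

same_with_1 = np.allclose(X, X_rec_1)
same_with_2 = np.allclose(X, X_rec_2)
print("shapes:", X.shape, X_rec_1.shape, X_rec_2.shape)
print("recovered with 1 component: ", same_with_1)
print("recovered with 2 components:", same_with_2)
```

Both reconstructions have the shape of X, but only the one that keeps as many components as there are input dimensions should match it.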

butterflyknife

It cannot do that: by reducing the dimensions with PCA, you have lost information (check pca.explained_variance_ratio_ for the fraction of variance you still retain). However, it goes back to the original space as well as it can; see the picture below.

[Figure: Comparison of original points with transformed points where information is lost]

(generated with

import numpy as np
import matplotlib.pyplot as plt
from sklearn.decomposition import PCA

pca = PCA(1)
X_orig = np.random.rand(10, 2)
# Project to 1-D, then map back into the original 2-D space
X_re_orig = pca.inverse_transform(pca.fit_transform(X_orig))

plt.scatter(X_orig[:, 0], X_orig[:, 1], label='Original points')
plt.scatter(X_re_orig[:, 0], X_re_orig[:, 1], label='InverseTransform')
# Connect each original point to its reconstruction
for i in range(10):
    plt.plot([X_orig[i, 0], X_re_orig[i, 0]], [X_orig[i, 1], X_re_orig[i, 1]])
plt.legend()
plt.show()

) If you had kept the number of dimensions the same (set pca = PCA(2)), you would recover the original points (the new points lie on top of the original ones):

[Figure: New points on top of original points]
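As a rough sketch of what "going back as well as it can" means: with the default whiten=False, inverse_transform should amount to multiplying the reduced coordinates by the kept component vectors and adding the mean back. The variable names and random data below are illustrative assumptions, not taken from the question.

```python
import numpy as np
from sklearn.decomposition import PCA

# Illustrative data only: 10 random points in 2-D, with a fixed seed.
rng = np.random.default_rng(0)
X = rng.random((10, 2))

pca = PCA(n_components=1)              # whiten=False by default
X_pca = pca.fit_transform(X)           # shape (10, 1)
X_back = pca.inverse_transform(X_pca)  # shape (10, 2)

# Manual reconstruction (assumes whiten=False):
# reduced coordinates times the kept component, plus the mean.
X_manual = X_pca @ pca.components_ + pca.mean_

# What was lost: the part of each point perpendicular to the kept component.
residual = X - X_back

print(np.allclose(X_back, X_manual))
print(np.abs(residual @ pca.components_.T).max())
```

The reconstructed points all lie on the line through the mean along the first principal component, so each one is the orthogonal projection of the original point onto that line. The residual is the discarded part.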

Jondiedoop
Once information has been lost, how does it try to go back to 2-D? And why do we even use inverse_transform then? – haneulkim Apr 05 '19 at 10:34