Questions tagged [dimensionality-reduction]

In machine learning and statistics, dimensionality reduction or dimension reduction is the process of reducing the number of random variables under consideration, and can be divided into feature selection and feature extraction.

422 questions
2 votes, 1 answer

Why does PCA result change drastically with a small change in the input?

I am using PCA to reduce an Nx3 array to an Nx2 array. This is mainly because the PCA transformation (an Nx2 matrix) is invariant to rotations or translations performed on the original Nx3 array. Let's take the following as an example: import numpy…
— Achintha Ihalage
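A likely culprit here is the sign ambiguity of PCA: each principal axis is only defined up to ±1, so a tiny perturbation of the input can flip an entire output column, which looks drastic in a scatter plot. A minimal sketch with scikit-learn on synthetic stand-in data; the `fix_signs` helper is hypothetical, not part of sklearn:

```python
import numpy as np
from sklearn.decomposition import PCA

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 3))                 # stand-in for the Nx3 array

pca1 = PCA(n_components=2)
Y1 = pca1.fit_transform(X)                    # Nx2 projection

# A tiny perturbation can flip the sign of a whole principal axis.
pca2 = PCA(n_components=2)
Y2 = pca2.fit_transform(X + 1e-8 * rng.normal(size=X.shape))

def fix_signs(pca, Y):
    """Hypothetical helper: force each component's largest-magnitude entry positive."""
    Y = Y.copy()
    for i, c in enumerate(pca.components_):
        if c[np.argmax(np.abs(c))] < 0:
            Y[:, i] = -Y[:, i]
    return Y

Y1, Y2 = fix_signs(pca1, Y1), fix_signs(pca2, Y2)
print(np.abs(Y1 - Y2).max())                  # tiny: embeddings now agree
```

When more than the sign changes, a Procrustes alignment between the two embeddings is another common fix.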
2 votes, 1 answer

Optimal perplexity for t-SNE when using larger datasets (>300k data points)

I am using t-SNE to make a 2D projection for visualization from a higher dimensional dataset (in this case 30-dims) and I have a question about the perplexity hyperparameter. It's been a while since I used t-SNE and had previously only used it on…
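For datasets in the hundreds of thousands of points, the default perplexity of 30 is often too small; common advice is to try larger values (on the order of N/100, or a small sweep such as 30, 100, 500) and compare the resulting maps. A hedged sketch with scikit-learn on a small synthetic stand-in (500 points, 30 dims):

```python
import numpy as np
from sklearn.manifold import TSNE

rng = np.random.default_rng(0)
X = rng.normal(size=(500, 30))      # stand-in for the 30-dim dataset

# Perplexity roughly sets the effective neighbourhood size per point;
# it must be smaller than the number of samples.
tsne = TSNE(n_components=2, perplexity=50, init="pca", random_state=0)
Y = tsne.fit_transform(X)
print(Y.shape)                      # (500, 2)
```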
2 votes, 0 answers

How to generalize UMAP parameters

The main parameters I am using to create the UMAP embedding are min_dist, a and b. I have set min_dist=0.5, a=1, b=1, which gives a meaningful low-dimensional representation for most datasets initially, when all the features are used (approx 10K to 30K…
— rj dj
2 votes, 0 answers

Dimensionality reduction using LDA for wavelet scalogram in python

I am trying to reduce the dimensionality of multiple scalograms, each of size 5x3844. How can I apply LDA in Python to do this? Any help would be appreciated. code: def wavelet(data): fs=256 lowcut=117 highcut=123 …
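Assuming each scalogram has a class label (LDA is supervised and needs one), a hedged sketch is to flatten every 5x3844 scalogram into a vector and fit scikit-learn's LinearDiscriminantAnalysis; the sample and class counts below are invented stand-ins:

```python
import numpy as np
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis

rng = np.random.default_rng(0)
# Hypothetical stand-in: 60 scalograms of shape 5x3844 with 3 class labels.
scalograms = rng.normal(size=(60, 5, 3844))
labels = rng.integers(0, 3, size=60)

# LDA works on flat feature vectors, so each 5x3844 scalogram is
# flattened to a 19220-dimensional vector first.
X = scalograms.reshape(len(scalograms), -1)

# n_components is capped at n_classes - 1 (here 2).
lda = LinearDiscriminantAnalysis(n_components=2)
X_reduced = lda.fit_transform(X, labels)
print(X_reduced.shape)   # (60, 2)
```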
2 votes, 1 answer

Coding an Isomap (& MDS) function using only numpy and scipy in Python

I have coded an Isomap function, starting by computing the Euclidean distance matrix (using scipy.spatial.distance.cdist); next, based on the K-nearest-neighbors method and Dijkstra's algorithm (to determine the shortest paths), I have computed the full…
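For reference, the pipeline described here can be sketched end to end with only numpy and scipy: cdist for Euclidean distances, a k-nearest-neighbour graph, scipy's Dijkstra for geodesics, then classical MDS via double centering. All parameter choices below are illustrative:

```python
import numpy as np
from scipy.spatial.distance import cdist
from scipy.sparse.csgraph import shortest_path

def isomap(X, n_neighbors=10, n_components=2):
    # 1) Euclidean distance matrix.
    D = cdist(X, X)
    # 2) k-nearest-neighbour graph: keep only the k smallest distances per row
    #    (inf marks a missing edge in a dense csgraph input).
    n = len(X)
    G = np.full((n, n), np.inf)
    idx = np.argsort(D, axis=1)[:, 1:n_neighbors + 1]
    for i in range(n):
        G[i, idx[i]] = D[i, idx[i]]
    # 3) Geodesic distances via Dijkstra on the symmetrized graph.
    DG = shortest_path(np.minimum(G, G.T), method="D", directed=False)
    # 4) Classical MDS: double-centre the squared geodesic distances.
    J = np.eye(n) - np.ones((n, n)) / n
    B = -0.5 * J @ (DG ** 2) @ J
    w, V = np.linalg.eigh(B)
    order = np.argsort(w)[::-1][:n_components]
    return V[:, order] * np.sqrt(np.maximum(w[order], 0))

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 3))
Y = isomap(X)
print(Y.shape)   # (200, 2)
```

If the k-NN graph is disconnected, some geodesic distances come back infinite; increasing n_neighbors is the usual remedy.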
2 votes, 1 answer

TensorFlow Embedding Projector for Visualization of Latent Space Images Not Working?

Can someone please help me with why the TensorFlow embedding projector is not working? I am training an autoencoder and am now trying to visualize the latent space. I followed this very useful tutorial:…
2 votes, 0 answers

How to measure distance when applying MDS

Hi, I have a very specific, weird question about applying MDS with Python. When creating a distance matrix of the original high-dimensional dataset (let's call it distanceHD), you can either measure those distances between all data points with…
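The usual answer: any metric can populate distanceHD, and the embedding then tries to preserve exactly those distances, so the choice of metric is a modelling decision rather than a technicality. A sketch with scipy's cdist and scikit-learn's MDS in precomputed mode, on synthetic stand-in data:

```python
import numpy as np
from scipy.spatial.distance import cdist
from sklearn.manifold import MDS

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 20))        # stand-in high-dimensional data

# distanceHD can be built with any metric; the 2D embedding then tries
# to preserve *those* distances, so the metric changes the result.
D_euclidean = cdist(X, X, metric="euclidean")
D_cosine = cdist(X, X, metric="cosine")

mds = MDS(n_components=2, dissimilarity="precomputed", random_state=0)
Y = mds.fit_transform(D_euclidean)    # swap in D_cosine to compare
print(Y.shape)                        # (100, 2)
```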
2 votes, 0 answers

How to get projection of data in quadratic discriminant analysis

For dimensionality reduction: in Principal Component Analysis (PCA) and Linear Discriminant Analysis (LDA), we can visualize the data projected onto the new, reduced dimensions by taking the dot product of the data with the eigenvectors. How can I do this for…
— rj dj
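Unlike PCA/LDA, QDA fits a separate covariance matrix per class, so there is no single shared eigenvector basis to take a dot product with. One hedged workaround is to use the per-class discriminant scores (log-posteriors) as low-dimensional coordinates; the data below is synthetic:

```python
import numpy as np
from sklearn.discriminant_analysis import QuadraticDiscriminantAnalysis

rng = np.random.default_rng(0)
# Synthetic stand-in: three 10-dim classes with shifted means.
X = np.vstack([rng.normal(loc=m, size=(50, 10)) for m in (0.0, 1.0, 2.0)])
y = np.repeat([0, 1, 2], 50)

qda = QuadraticDiscriminantAnalysis(store_covariance=True).fit(X, y)

# QDA has no global linear projection; use the class log-posteriors as
# a (n_samples, n_classes) "projection" into class-score space instead.
scores = qda.predict_log_proba(X)
print(scores.shape)   # (150, 3)
```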
2 votes, 2 answers

When should we use Principal Component Analysis?

In machine learning, more features or dimensions can decrease a model's accuracy, since there is more data that needs to be generalized; this is known as the curse of dimensionality. Dimensionality reduction is a way to reduce the complexity of a…
— Sachin Rastogi
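A practical companion to this question: scikit-learn's PCA accepts a fractional n_components, keeping the smallest number of components that reach that share of the variance, which makes the accuracy-vs-dimensionality trade-off explicit. A sketch on synthetic low-rank data:

```python
import numpy as np
from sklearn.decomposition import PCA

rng = np.random.default_rng(0)
# Hypothetical data: 20 features, but most variance lives in 5 directions.
X = rng.normal(size=(300, 5)) @ rng.normal(size=(5, 20))
X += 0.01 * rng.normal(size=X.shape)

# n_components=0.95 keeps the smallest number of components that
# together explain at least 95% of the variance.
pca = PCA(n_components=0.95).fit(X)
print(pca.n_components_)                    # few components suffice here
print(pca.explained_variance_ratio_.sum())  # >= 0.95
```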
2 votes, 0 answers

How to reduce position changes after dimensionality reduction?

Disclaimer: I'm a machine learning beginner. I'm working on visualizing high-dimensional data (text as tf-idf vectors) in 2D space. My goal is to label/modify those data points, recompute their positions after the modification, and update…
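One standard answer: use a parametric method (e.g. PCA) fitted once, then re-apply the same fitted transform after each modification, so untouched points keep their coordinates; non-parametric methods like t-SNE re-embed everything on each run. A sketch with synthetic stand-ins for the tf-idf vectors:

```python
import numpy as np
from sklearn.decomposition import PCA

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 50))         # stand-in for the tf-idf vectors

# Fit the projection ONCE and keep it; the same fitted model can then
# map modified points into the same 2D space.
pca = PCA(n_components=2).fit(X)
Y = pca.transform(X)

X_mod = X.copy()
X_mod[0] += 0.5                        # modify a single document vector
Y_mod = pca.transform(X_mod)

print(np.allclose(Y[1:], Y_mod[1:]))   # True: other points did not move
```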
2 votes, 2 answers

Principal components of PCA

I came across this question on datacamp.com: Below are three scatter plots of the same point cloud. Each scatter plot shows a different set of axes (in red). In which of the plots could the axes represent the principal components of the point…
2 votes, 1 answer

Bin and transpose in R

I am still getting the hang of R and coding in general, so bear with me on this. My problem: this is a dimension-reduction idea I have consisting of three steps, and I need help with the first two. 1) bin rows; 2) transpose the binned rows into new columns so…
— santma
2 votes, 0 answers

Clustering with Autoencoder

I made a model for clustering and its encoded dimension is about 3000. To check that the autoencoder is well trained, I drew a 2D PCA plot and a 3D PCA plot, and they look nice. My question is: what is the general way to cluster with this encoded…
— Gwan
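A common recipe is to run an ordinary clustering algorithm directly on the bottleneck codes and pick the number of clusters with an internal score such as silhouette. A hedged sketch with random stand-in codes (real codes would come from the trained encoder):

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.metrics import silhouette_score

rng = np.random.default_rng(0)
# Hypothetical stand-in for the autoencoder's 3000-dim bottleneck codes.
codes = rng.normal(size=(500, 3000))

# Run KMeans on the encoded vectors for a few candidate k values and
# keep the one with the best silhouette score.
best = max(
    (KMeans(n_clusters=k, n_init=10, random_state=0).fit(codes) for k in (2, 3, 4)),
    key=lambda km: silhouette_score(codes, km.labels_),
)
print(best.n_clusters, best.labels_.shape)
```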
2 votes, 2 answers

Fourier transformation as a dimensionality reduction technique in Python

My dataset has 2000 attributes and 200 samples. I need to reduce its dimensionality. To do this, I am trying to use the Fourier transformation for dimensionality reduction. The Fourier transformation returns the discrete Fourier transform when I feed…
— user3104352
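The usual way to turn the discrete Fourier transform into a dimensionality reduction is to keep only the first k coefficients of each row's spectrum; k=50 below is an illustrative choice, not a recommendation:

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 2000))       # 200 samples, 2000 attributes

# Treat each row as a signal, take the real FFT, and keep the first k
# coefficients: low frequencies carry the smooth structure, so
# truncating the spectrum acts as a (lossy) dimensionality reduction.
k = 50
F = np.fft.rfft(X, axis=1)[:, :k]

# The coefficients are complex; stack real and imaginary parts (or use
# np.abs for magnitudes) to get an ordinary real-valued feature matrix.
X_reduced = np.hstack([F.real, F.imag])
print(X_reduced.shape)   # (200, 100)
```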
2 votes, 1 answer

What does affinity='precomputed' mean in Feature Agglomeration dimensionality reduction?

What does affinity='precomputed' mean in feature agglomeration dimensionality reduction (scikit-learn), and how is it used? I got much better results than by using other affinity options (such as 'euclidean', 'l1', 'l2' or 'manhattan'); however, I'm…