In machine learning and statistics, dimensionality reduction or dimension reduction is the process of reducing the number of random variables under consideration, and can be divided into feature selection and feature extraction.
Questions tagged [dimensionality-reduction]
422 questions
6
votes
0 answers
TensorFlow Embedding Projector: how does its t-SNE algorithm differ from other implementations?
I've been playing with the standalone TensorFlow Embedding Projector (http://projector.tensorflow.org/) and found it a very helpful tool for visualization. However, when I try to replicate the t-SNE result using other implementations (e.g., Rtsne,…

jsl
- 61
- 1
6
votes
1 answer
Strange iteration results "error is nan" and RuntimeWarning using t-SNE
I am using a Python implementation of t-SNE for dimensionality reduction on X, which contains 100 instances, each described by 1024 parameters, for CNN visualization.
X.shape = [100, 1024]
X.dtype = float32
When I run:
Y = tsne.tsne(X)
The first warning…

Julep
- 760
- 1
- 6
- 18
6
votes
1 answer
In natural language processing (NLP), how do you make an efficient dimension reduction?
In NLP, the dimension of the features is almost always very large. For example, for one project at hand, the dimension of the features is almost 20 thousand (p = 20,000), and each feature is a 0-1 integer to show whether a specific word or…

zxzx179
- 187
- 7
6
votes
1 answer
In preprocessing data with high cardinality, do you hash first or one-hot-encode first?
Hashing reduces dimensionality while one-hot-encoding essentially blows up the feature space by transforming multi-categorical variables into many binary variables. So it seems like they have opposite effects. My questions are:
What is the benefit…
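A minimal sketch of the contrast the question describes, assuming a made-up `cities` column. `one_hot` and `hash_encode` are hypothetical helpers written for illustration, not library functions: one-hot encoding grows with the number of distinct categories, while the hashing trick fixes the width up front and accepts collisions.

```python
import hashlib

def one_hot(values):
    """One column per distinct category, so width grows with cardinality."""
    vocab = sorted(set(values))
    index = {v: i for i, v in enumerate(vocab)}
    rows = []
    for v in values:
        row = [0] * len(vocab)
        row[index[v]] = 1
        rows.append(row)
    return rows

def hash_encode(values, n_buckets):
    """Hashing trick: width is fixed at n_buckets; distinct values may collide."""
    rows = []
    for v in values:
        digest = hashlib.sha256(v.encode("utf-8")).hexdigest()
        row = [0] * n_buckets
        row[int(digest, 16) % n_buckets] = 1
        rows.append(row)
    return rows

# Hypothetical categorical column (assumption, not from the question).
cities = ["paris", "tokyo", "lima", "oslo", "cairo", "paris"]
one_hot_rows = one_hot(cities)                  # 5 distinct values -> 5 columns
hashed_rows = hash_encode(cities, n_buckets=4)  # always 4 columns
```

With five distinct values and four buckets, at least two categories must share a bucket. That collision is the price of the fixed width.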

Newbie
- 91
- 2
- 5
6
votes
1 answer
Circular Dimensionality Reduction?
I want dimensionality reduction such that the dimensions it returns are circular.
For example, if I reduce 12-D data to 2-D, normalized between 0 and 1, then I want (0,0) to be as close to (.1,.1) as it is to (.9,.9).
What is my algorithm? (bonus points for…
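One way to pin down the requirement is to write out the wrap-around distance it implies. The sketch below only defines that metric, assuming each output coordinate lives on a circle of period 1. It is not a reduction algorithm, and `torus_distance` is a hypothetical name.

```python
import math

def torus_distance(p, q, period=1.0):
    """Euclidean distance where every coordinate wraps around at `period`."""
    total = 0.0
    for a, b in zip(p, q):
        d = abs(a - b) % period
        d = min(d, period - d)  # take the shorter way around the circle
        total += d * d
    return math.sqrt(total)

origin = (0.0, 0.0)
near = torus_distance(origin, (0.1, 0.1))
wrapped = torus_distance(origin, (0.9, 0.9))  # wraps past 1.0 back toward 0.0
```

Under this metric `near` and `wrapped` agree up to floating-point rounding, which is the property asked for in the example.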

Cortexelus
- 341
- 1
- 7
5
votes
1 answer
FastICA using scikit-learn: reconstruction error analysis
I am trying to use the FastICA procedure in scikit-learn. For validation purposes I tried to understand the difference between PCA-based and ICA-based signal reconstruction.
The original number of observed signals is 6 and I tried to use 3 reconstruction…

schuler
- 175
- 2
- 4
- 12
5
votes
2 answers
Linear Discriminant Analysis inverse transform
I am trying to use Linear Discriminant Analysis from the scikit-learn library to perform dimensionality reduction on my data, which has more than 200 features. But I could not find the inverse_transform function in the LDA class.
I just wanted to…

Babak Hashemi
- 327
- 3
- 11
5
votes
1 answer
How to get `skbio` PCoA (Principal Coordinate Analysis) results?
I'm looking at the attributes of skbio's PCoA method (listed below). I am new to this API and I want to be able to get the eigenvectors and the original points projected onto the new axes, similar to .fit_transform in sklearn.decomposition.PCA so I…

O.rka
- 29,847
- 68
- 194
- 309
5
votes
0 answers
Visualizing a distance matrix using t-SNE - Python
I've computed a distance matrix and I'm trying two approaches to visualize it.
This is my distance matrix:
delta =
[[ 0. 0.71370845 0.80903791 0.82955157 0.56964983 0. 0. ]
[ 0.71370845 0. 0.99583115 1. …

pceccon
- 9,379
- 26
- 82
- 158
5
votes
1 answer
Selecting the components showing the most variance in PCA
I have a huge data set (32000*2500) that I need for training. This seems to be too much for my classifier, so I decided to do some reading on dimensionality reduction and specifically into PCA.
From my understanding, PCA selects the current data and…

StuckInPhDNoMore
- 2,507
- 4
- 41
- 73
5
votes
1 answer
R - Get a matrix with the reduced number of features with SVD
I'm using the SVD package with R and I'm able to reduce the dimensionality of my matrix by replacing the lowest singular values with 0. But when I recompose my matrix I still have the same number of features; I could not find how to effectively delete…
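A hedged sketch of why the recomposed matrix keeps its width, written with NumPy rather than R and using made-up random data. Zeroing singular values yields a rank-k approximation with the original shape, while projecting onto the first k right singular vectors gives a matrix with only k columns.

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(8, 5))  # 8 samples, 5 features (illustrative data)
k = 2                        # number of components to keep (assumption)

U, s, Vt = np.linalg.svd(X, full_matrices=False)

# Keeping only the top-k singular values and recomposing gives a rank-k
# approximation, but it still has 5 columns, like the original matrix.
X_approx = (U[:, :k] * s[:k]) @ Vt[:k, :]

# Projecting onto the first k right singular vectors drops the extra
# columns: the result is 8 x k and equals U[:, :k] * s[:k].
X_reduced = X @ Vt[:k, :].T
```

`X_reduced` holds the coordinates of each sample in the k-dimensional subspace; multiplying it by `Vt[:k, :]` recovers `X_approx`.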

ClydeX
- 81
- 2
- 5
5
votes
1 answer
How to add an image thumbnail as (or beside) a plot marker in MATLAB?
I am running Isomap dimensionality reduction in MATLAB on a series of images. I want to plot each image's thumbnail beside the point on the manifold corresponding to it.
I am currently using 2 different isomaps http://isomap.stanford.edu/ and…

web_ninja
- 2,351
- 3
- 22
- 43
4
votes
1 answer
Reduced dimensions visualization for true vs predicted values
I have a dataframe which looks like this:
label predicted F1 F2 F3 .... F40
major minor 2 1 4
major major 1 0 10
minor patch 4 3 23
major patch 2 1 11
minor minor …

Brie MerryWeather
- 132
- 3
- 13
4
votes
2 answers
After performing t-SNE dimensionality reduction, use k-means and check which features contribute the most in each individual cluster
The following plot displays the t-SNE plot. I can show it here but unfortunately, I can't show you the labels. There are 4 different labels:
The plot was created using a data frame called scores, which contains approximately 1100 patient samples…

Programming Noob
- 1,232
- 3
- 14
4
votes
1 answer
What is the difference between xgboost.plot_importance() and model.feature_importances_ in XGBClassifier?
What is the difference between xgboost.plot_importance() and model.feature_importances_ in XGBClassifier?
So here I make some dummy data:
import numpy as np
import pandas as pd
# generate some random data for demonstration purpose, use your original…

Farah Amirah
- 59
- 1
- 6