Questions tagged [dimensionality-reduction]

In machine learning and statistics, dimensionality reduction or dimension reduction is the process of reducing the number of random variables under consideration, and can be divided into feature selection and feature extraction.

422 questions
3 votes · 1 answer

PCA dimension reduction for classification

I am using Principal Component Analysis on features extracted from different layers of a CNN. I have downloaded the dimension reduction toolbox from here. I have a total of 11232 training images, and the feature vector for each image has 6532 dimensions, so the…
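A minimal sketch of this kind of reduction with scikit-learn (assumed library; the random array below is a stand-in for the CNN features, not the asker's data, and the shapes are scaled down for illustration):

```python
import numpy as np
from sklearn.decomposition import PCA

rng = np.random.default_rng(0)
# Stand-in for CNN features: 200 images x 512 dims
# (the question has 11232 images x 6532 dims)
X = rng.normal(size=(200, 512))

# A float n_components keeps just enough components to
# explain ~95% of the variance
pca = PCA(n_components=0.95, svd_solver="full")
X_reduced = pca.fit_transform(X)

print(X_reduced.shape)  # (200, k) with k chosen by the variance criterion
```

Letting PCA pick `k` from an explained-variance target avoids hand-tuning the output dimension for each CNN layer.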
3 votes · 0 answers

t-SNE perplexity for a small data set

I am using t-SNE to visualize cytometry data. Most of the guides I found (https://distill.pub/2016/misread-tsne/) warn how the choice of the perplexity hyperparameter can influence the result. However, my data set is really small, always expecting…
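A small sketch of running t-SNE with a perplexity scaled to a tiny sample (scikit-learn assumed; the data is random, not cytometry measurements — scikit-learn requires perplexity to be smaller than the number of samples, and for very small sets values around n/5 to n/3 are a common starting point):

```python
import numpy as np
from sklearn.manifold import TSNE

rng = np.random.default_rng(0)
# Tiny stand-in for a small cytometry panel: 30 cells x 10 markers
X = rng.normal(size=(30, 10))

# perplexity must be < n_samples; 8 is roughly n/4 here
emb = TSNE(n_components=2, perplexity=8, init="pca",
           random_state=0).fit_transform(X)
print(emb.shape)  # (30, 2)
```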
3 votes · 0 answers

Example for dimension reduction (SVD vs Random Projection) in R

I am learning about dimension reduction techniques in R. I take one image as input and have reduced its dimension via SVD using this code: library(raster) img <- raster("C:/Users/***/Pictures/pansy.jpg") img_flip <- flip(img, direction = "y")…
Siddhu · 1,188 · 2 · 14 · 24
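The contrast the question asks about can be sketched in a few lines (Python with NumPy/scikit-learn rather than the asker's R; the matrix is random, standing in for image pixels): truncated SVD gives the best rank-k approximation of the data itself, while a random projection is data-independent and only preserves pairwise distances approximately.

```python
import numpy as np
from sklearn.random_projection import GaussianRandomProjection

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 300))   # stand-in for a flattened image matrix
k = 50

# Truncated SVD: best rank-k approximation in the least-squares sense
U, s, Vt = np.linalg.svd(X, full_matrices=False)
X_svd = U[:, :k] @ np.diag(s[:k]) @ Vt[:k, :]

# Random projection: cheap and data-independent; preserves pairwise
# distances approximately (Johnson-Lindenstrauss), not the matrix itself
X_rp = GaussianRandomProjection(n_components=k,
                                random_state=0).fit_transform(X)

print(X_svd.shape, X_rp.shape)  # (100, 300) (100, 50)
```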
3 votes · 1 answer

Any R implementation for dimension reduction using random projection?

I have a large-p (~20K), small-n (~500) problem. The first thing I thought of was dimension reduction. After trying PCA, robust PCA, ICA, and removing highly correlated features, I am now considering random projection. However, there is no simple R…
whatsnext · 617 · 7 · 19
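For reference, this is what the same large-p/small-n setup looks like in Python with scikit-learn (an assumption on my part, since the asker wants R; the data is random): the Johnson-Lindenstrauss bound even tells you how small the target dimension can be for a chosen distortion.

```python
import numpy as np
from sklearn.random_projection import (SparseRandomProjection,
                                       johnson_lindenstrauss_min_dim)

rng = np.random.default_rng(0)
# As in the question: n ~ 500 samples, p ~ 20K features
X = rng.normal(size=(500, 20000))

# Minimum target dimension for ~20% pairwise-distance distortion
k = johnson_lindenstrauss_min_dim(n_samples=500, eps=0.2)
X_new = SparseRandomProjection(n_components=k,
                               random_state=0).fit_transform(X)
print(X_new.shape)
```

The sparse variant is usually preferred at this scale because the projection matrix is mostly zeros, making the transform much cheaper than a dense Gaussian one.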
3 votes · 2 answers

Perform clustering using t-SNE dimensionality reduction

The question is a matter of which should come first: (a) the clustering or (b) the dimensionality reduction algorithm? In other words, can I apply a pseudo-dimensionality-reduction method like t-SNE (as it is not really one) and then use a clustering…
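One common pattern for this ordering question, sketched below under assumptions of my own (scikit-learn, synthetic blob data): cluster in a space where distances are meaningful (original or PCA-reduced), and use t-SNE only to draw the result, since t-SNE distorts distances and densities.

```python
import numpy as np
from sklearn.datasets import make_blobs
from sklearn.decomposition import PCA
from sklearn.cluster import KMeans
from sklearn.manifold import TSNE

X, _ = make_blobs(n_samples=150, n_features=20, centers=3, random_state=0)

# Cluster in a PCA-reduced space, where distances are still meaningful...
X_pca = PCA(n_components=10).fit_transform(X)
labels = KMeans(n_clusters=3, n_init=10, random_state=0).fit_predict(X_pca)

# ...and use t-SNE only for the 2-D picture, colored by those labels
emb = TSNE(n_components=2, perplexity=30, random_state=0).fit_transform(X_pca)
print(emb.shape)  # (150, 2)
```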
3 votes · 1 answer

When to use ICA rather than PCA?

I know that PCA and ICA are both used for dimensionality reduction, and that in PCA the principal components are orthogonal (not necessarily independent), whereas in ICA they are independent. Can anybody please clarify when it is better to use ICA rather than PCA?…
starrr · 1,013 · 1 · 17 · 48
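The classic illustration of the difference (a sketch with scikit-learn and synthetic signals, both my assumptions): mix two independent non-Gaussian sources; PCA returns orthogonal directions of maximal variance, while ICA looks for statistically independent components, which is what source separation needs.

```python
import numpy as np
from sklearn.decomposition import PCA, FastICA

t = np.linspace(0, 8, 1000)
# Two independent non-Gaussian sources, linearly mixed
S = np.c_[np.sign(np.sin(3 * t)), np.sin(5 * t)]
A = np.array([[1.0, 0.5], [0.5, 1.0]])
X = S @ A.T

# PCA: orthogonal directions of maximal variance
X_pca = PCA(n_components=2).fit_transform(X)
# ICA: statistically independent components (recovers the sources,
# up to sign and scale)
S_est = FastICA(n_components=2, random_state=0).fit_transform(X)
print(X_pca.shape, S_est.shape)  # (1000, 2) (1000, 2)
```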
3 votes · 0 answers

How do ALS and SVD differ?

Do both ALS and SVD involve dimensionality reduction, and if so, how do the two methods differ? At a glance, I'm not sure why they're not the same.
cshin9 · 1,440 · 5 · 20 · 33
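A sketch of the contrast on a fully observed matrix (NumPy only; the matrix, rank, and regularization strength are all illustrative choices of mine): SVD factors the matrix in one shot and gives the exact best rank-k approximation, while ALS reaches a similar factorization iteratively by fixing one factor and solving a ridge regression for the other — which is why it is favored in recommenders, where entries are missing and the solves can be distributed.

```python
import numpy as np

rng = np.random.default_rng(0)
# A noise-free rank-8 "ratings" matrix
R = rng.normal(size=(60, 8)) @ rng.normal(size=(8, 40))
k, lam = 8, 0.1

# SVD: one-shot, exact best rank-k factorization
U, s, Vt = np.linalg.svd(R, full_matrices=False)
R_svd = U[:, :k] @ np.diag(s[:k]) @ Vt[:k, :]

# ALS: alternate ridge-regression solves for the two factors
P = rng.normal(size=(60, k))
Q = rng.normal(size=(40, k))
I = lam * np.eye(k)
for _ in range(20):
    P = R @ Q @ np.linalg.inv(Q.T @ Q + I)
    Q = R.T @ P @ np.linalg.inv(P.T @ P + I)
R_als = P @ Q.T

print(np.linalg.norm(R - R_svd), np.linalg.norm(R - R_als))
```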
3 votes · 1 answer

Fuzzy clustering using unsupervised dimensionality reduction

An unsupervised dimensionality reduction algorithm takes as input an N×C1 matrix, where N is the number of input vectors and C1 is the number of components of each vector (the vector's dimensionality). As a result, it returns a new matrix…
3 votes · 3 answers

Autoencoders for high dimensional data

I'm working on a project where I need to reduce the dimensionality of my observations while still retaining a meaningful representation of them. The use of autoencoders was strongly suggested for many reasons, but I'm not quite sure it's the best…
G4bri3l · 4,996 · 4 · 31 · 53
3 votes · 1 answer

What is the correct implementation of LDA (Linear Discriminant Analysis)?

I found that the result of LDA in OpenCV differs from other libraries. For example, the input DATA (13 samples with 4 dimensions) was: 7 26 6 60; 1 29 15 52; 11 56 8 20; 11 31 8 47; 7 52 6 …
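A reference implementation to compare OpenCV's output against could look like this (scikit-learn assumed; the 13×4 data below is made up to mirror the question's shape, not the asker's numbers). Note that supervised LDA projects to at most n_classes − 1 dimensions, a frequent source of cross-library discrepancies.

```python
import numpy as np
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis

rng = np.random.default_rng(0)
# Hypothetical stand-in: 13 samples x 4 features, 3 class labels
y = np.repeat([0, 1, 2], [5, 4, 4])
X = rng.normal(size=(13, 4)) + y[:, None] * 3.0

lda = LinearDiscriminantAnalysis(n_components=2)  # at most n_classes - 1
X_lda = lda.fit_transform(X, y)
print(X_lda.shape)  # (13, 2)
```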
3 votes · 2 answers

Is dimensionality reduction reversible?

I have implemented a dimensionality reduction algorithm using Encog that takes a dataset (call it A) with multiple features and reduces it to a dataset (B) with only one feature (I need that for time series analysis). Now my question is, I have a…
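The short answer for linear methods can be demonstrated directly (a PCA sketch with scikit-learn rather than the asker's Encog setup; random data): mapping back from a reduced space is only approximate unless every component is kept.

```python
import numpy as np
from sklearn.decomposition import PCA

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 5))

# Reducing 5 features to 1 and mapping back is lossy...
pca1 = PCA(n_components=1).fit(X)
err_lossy = np.linalg.norm(X - pca1.inverse_transform(pca1.transform(X)))

# ...whereas keeping all 5 components is (numerically) invertible
pca5 = PCA(n_components=5).fit(X)
err_full = np.linalg.norm(X - pca5.inverse_transform(pca5.transform(X)))
print(err_lossy, err_full)
```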
3 votes · 1 answer

Which one should I use for dimension reduction with PCA in MATLAB, pcacov or eigs?

I'm trying to reduce my training set dimension from 1296×70000 to 128×70000. I wrote the code below: A=DicH; [M N]=size(A); mu=mean(A,2);%mean of columns Phi=zeros(M,N); C=zeros(M,M); for j=1:N Phi(:,j)=A(:,j)-mu; c=Phi(:,j)*(Phi(:,j))'; …
Mehran · 307 · 1 · 3 · 15
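The covariance-eigendecomposition PCA the MATLAB excerpt is building can be sketched without the per-column loop (NumPy rather than MATLAB, with a scaled-down stand-in matrix — both my assumptions). This mirrors the pcacov-style route: form the covariance once, take the top-k eigenvectors; MATLAB's eigs differs mainly in computing only those top k instead of the full decomposition.

```python
import numpy as np

rng = np.random.default_rng(0)
# Small stand-in for the 1296 x 70000 dictionary in the question
A = rng.normal(size=(64, 500))

mu = A.mean(axis=1, keepdims=True)
Phi = A - mu                      # center the columns
C = (Phi @ Phi.T) / Phi.shape[1]  # covariance, no explicit loop needed

# eigh returns eigenvalues in ascending order; take the top k
w, V = np.linalg.eigh(C)
k = 16
W = V[:, ::-1][:, :k]             # top-k eigenvectors, like eigs(C, k)
A_reduced = W.T @ Phi
print(A_reduced.shape)  # (16, 500)
```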
2 votes · 0 answers

UMAP on batch data

I have a dataset of more than 300M records, each with around 800 features. I have split the dataset into 1000 CSV files (each around 2.5 GB). I want to use UMAP to reduce the 800-dimensional space to a lower-dimensional one (e.g., 10)…
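UMAP itself has no streaming fit, so at this scale a common workaround is a batch-friendly first stage. The sketch below deliberately swaps in scikit-learn's IncrementalPCA for that stage (not UMAP — that substitution, the batch sizes, and the random stand-in data are all assumptions); one could then run UMAP on the 10-dimensional output or on a subsample.

```python
import numpy as np
from sklearn.decomposition import IncrementalPCA

rng = np.random.default_rng(0)

# Batch-friendly first-stage reduction: 800 features -> 10, fed file by
# file so the full 300M-row matrix never has to sit in memory
ipca = IncrementalPCA(n_components=10)
for _ in range(5):                        # stand-ins for the 1000 CSV files
    batch = rng.normal(size=(2000, 800))  # e.g. pd.read_csv(path).to_numpy()
    ipca.partial_fit(batch)

reduced = ipca.transform(rng.normal(size=(100, 800)))
print(reduced.shape)  # (100, 10)
```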
2 votes · 1 answer

t-SNE for multiple datasets in R

I have 7 datasets, and each of them has two dataframes: metadata, which contains a crucial column indicating who is a responder and who is not, and a dataframe of cell types. Sample using dput: this is an example from one of the…
Programming Noob · 1,232 · 3 · 14
2 votes · 1 answer

Implementing sklearn PCA on limited number of variables in a pipeline

I'm setting up a machine learning pipeline to classify some data. One source of the data is a very good candidate for PCA and makes up the last n dimensions of the dataset. I would like to use PCA on these variables but not the preceding variables.…
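Scikit-learn's ColumnTransformer handles exactly this split inside a pipeline (a sketch under my own assumptions: 12 features of which the last 8 get PCA, random data, a logistic-regression classifier as placeholder model):

```python
import numpy as np
from sklearn.compose import ColumnTransformer
from sklearn.decomposition import PCA
from sklearn.pipeline import Pipeline
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 12))     # last 8 columns are the PCA candidates
y = rng.integers(0, 2, size=200)

# Apply PCA only to columns 4..11; pass the first 4 through untouched
ct = ColumnTransformer(
    [("pca", PCA(n_components=3), list(range(4, 12)))],
    remainder="passthrough",
)
clf = Pipeline([("reduce", ct), ("model", LogisticRegression())])
clf.fit(X, y)
print(clf.named_steps["reduce"].transform(X).shape)  # (200, 7)
```

Transformed columns come first in the output (3 PCA components), followed by the 4 passthrough columns, giving 7 features for the classifier.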