Questions tagged [dimensionality-reduction]

In machine learning and statistics, dimensionality reduction (or dimension reduction) is the process of reducing the number of random variables under consideration; it can be divided into feature selection and feature extraction.

422 questions
0 votes, 1 answer

How to take random projections in LSH when there are both Numerical and Categorical Data?

Note: I am using LSH for a nearest-neighbour query. Assume the data set has 5 features (f1, f2, ..., f5), where the first 2 are numerical and the other 3 are categorical, and one or more of these categorical features may be something like a username or subject, which would be quite…
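One common way to handle the mixed types (a sketch of one approach, not the only one) is to one-hot encode the categorical features so every column is numeric, then hash with sign random projections; all names and sizes below are made up for illustration.

```python
import numpy as np

# Sketch: hash mixed numerical/categorical rows with sign random projections.
# Categoricals are one-hot encoded first so every feature is numeric.
rng = np.random.default_rng(0)

def one_hot(values, vocabulary):
    """Map a list of categorical values to a one-hot matrix."""
    index = {v: i for i, v in enumerate(vocabulary)}
    out = np.zeros((len(values), len(vocabulary)))
    for row, v in enumerate(values):
        out[row, index[v]] = 1.0
    return out

# Two numerical features (already scaled) and one categorical feature.
numeric = rng.normal(size=(5, 2))
categorical = one_hot(["a", "b", "a", "c", "b"], vocabulary=["a", "b", "c"])
X = np.hstack([numeric, categorical])          # combined feature matrix

n_bits = 8                                     # length of each LSH signature
planes = rng.normal(size=(X.shape[1], n_bits)) # random hyperplanes
signature = (X @ planes) >= 0                  # one bit per hyperplane
print(signature.astype(int))
```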
0 votes, 0 answers

Using Matlab, what's the best way to import a set of images into a data matrix so that I can run dimensionality reduction algorithms efficiently?

I'm using the CIFAR-10 dataset and trying to run some dimensionality reduction algorithms on it. It's a collection of 32x32 colour images, so I'm currently importing the data by putting each 32x32x3 image into one column, like so: X =…
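The question is about MATLAB, but the reshaping idea is language-independent; here is a small numpy sketch (with random data standing in for a CIFAR-10 batch) that stacks each flattened image as one column of a data matrix.

```python
import numpy as np

# Sketch of the reshaping idea: one flattened HxWxC image per column.
images = np.random.rand(100, 32, 32, 3)        # 100 fake 32x32 RGB images
X = images.reshape(images.shape[0], -1).T      # shape (3072, 100)
print(X.shape)
```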
user3457834
0 votes, 1 answer

Is it possible to reverse the transformation of KMeans in sklearn?

After clustering a dataset and then transforming the data to distances from the centroids using sklearn.cluster.KMeans, is it possible to reverse the transformation, given the centroids, and get back the original features?
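For context, KMeans.transform returns each sample's distances to the centroids, and scikit-learn provides no inverse for it; a rough, lossy sketch is to map every sample back to its nearest centroid.

```python
import numpy as np
from sklearn.cluster import KMeans

# Sketch: approximate "reconstruction" by replacing each sample with the
# centroid it is closest to (within-cluster detail is lost).
X = np.random.rand(200, 5)
km = KMeans(n_clusters=3, n_init=10, random_state=0).fit(X)

distances = km.transform(X)                    # shape (200, 3)
nearest = distances.argmin(axis=1)             # same as km.predict(X)
X_approx = km.cluster_centers_[nearest]
print(np.abs(X - X_approx).mean())
```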
0 votes, 0 answers

Retrieve candidate attributes for a node in Decision Tree using R

I am using R to create a decision tree with CART. I did it using feature_vectors <- read.table("C:/Users/DVS/Desktop/TagMe!-Data/Train/feature_vectors.txt", quote="\"") set.seed(1234) ind <- sample(2, nrow(winequality.red), replace=TRUE,…
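The question is about R/CART; purely as an illustration of the same idea, a fitted scikit-learn tree exposes which attribute each internal node splits on (the iris data below is just a stand-in).

```python
from sklearn.datasets import load_iris
from sklearn.tree import DecisionTreeClassifier

# Sketch: list the splitting attribute of every internal node in a fitted tree.
data = load_iris()
clf = DecisionTreeClassifier(random_state=1234).fit(data.data, data.target)

for node, feat in enumerate(clf.tree_.feature):
    if feat >= 0:                              # negative values mark leaf nodes
        print(f"node {node} splits on {data.feature_names[feat]}")
```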
dvs
0 votes, 1 answer

Reuse dimensionality reduction after designing model with Matlab

I'm using binary classification with SVM and MLP on financial data. My input data has 21 features, so I used dimensionality reduction methods to reduce the dimension of the data. Some dimensionality reduction methods, like stepwise regression, report…
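Whatever toolbox is used, the key point is to fit the reduction on the training data once and then reuse that exact fitted transform on new data; a scikit-learn Pipeline sketch (with made-up 21-feature data) illustrates the pattern.

```python
import numpy as np
from sklearn.pipeline import Pipeline
from sklearn.decomposition import PCA
from sklearn.svm import SVC

# Sketch: the reduction is fitted once, on training data, and the same
# fitted projection is applied automatically at prediction time.
X_train = np.random.rand(300, 21)              # 21 features, as in the question
y_train = np.random.randint(0, 2, 300)
X_new = np.random.rand(10, 21)

model = Pipeline([("reduce", PCA(n_components=8)),   # 8 is an arbitrary choice
                  ("clf", SVC())])
model.fit(X_train, y_train)
print(model.predict(X_new))                    # same projection reused here
```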
Eghbal
0 votes, 1 answer

Matrices kernelpca

We are working on a project and trying to get some results with KPCA. We have a dataset of handwritten digits and have taken the first 200 examples of each digit, so our complete traindata matrix is 2000x784 (784 is the number of dimensions). When we do KPCA we…
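As a point of comparison, here is a scikit-learn KernelPCA sketch on randomly generated data with the same 2000x784 shape; the kernel, gamma, and component count are arbitrary choices.

```python
import numpy as np
from sklearn.decomposition import KernelPCA

# Sketch: kernel PCA on data shaped like the question's traindata matrix.
X = np.random.rand(2000, 784)                  # stand-in for the digit data

kpca = KernelPCA(n_components=50, kernel="rbf", gamma=1e-3)
X_reduced = kpca.fit_transform(X)
print(X_reduced.shape)                         # (2000, 50)
```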
0 votes, 1 answer

What is the effect of randomSeed on dimensionality reduction by random projection?

1) What is the effect of the randomSeed parameter on dimensionality reduction by random projection in Weka? 2) Secondly, it is said that dimensionality reduction does not lose information, but I have observed that if we set the numberOfAttributes…
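The general behaviour (shown here with scikit-learn rather than Weka) is that the seed only selects which random matrix is drawn, and any seed preserves pairwise distances approximately rather than exactly, with more distortion as the number of output attributes shrinks.

```python
import numpy as np
from sklearn.random_projection import GaussianRandomProjection
from sklearn.metrics import pairwise_distances

# Sketch: compare how well two different seeds preserve pairwise distances.
X = np.random.rand(100, 500)
d_original = pairwise_distances(X)

for seed in (1, 2):
    rp = GaussianRandomProjection(n_components=50, random_state=seed)
    d_projected = pairwise_distances(rp.fit_transform(X))
    ratio = d_projected[d_original > 0] / d_original[d_original > 0]
    print(f"seed {seed}: distance ratio mean={ratio.mean():.3f}, std={ratio.std():.3f}")
```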
0 votes, 1 answer

How can I find a projection to preserve the relative value of inner product?

I want to do dimensionality reduction on a 100-dimensional vector v to get a 10-dimensional vector v', and the property below must be preserved: for arbitrary 100-dimensional vectors w1, w2, if v * w1 > v * w2 (where * denotes the inner product), then after reduction v' *…
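One observation that may help: if the low-dimensional subspace you project onto contains v itself (orthonormal projection rows, with v/||v|| among them), then v'·w' equals v·w exactly for every w, so the ordering is preserved. A small numpy sketch with random stand-in vectors:

```python
import numpy as np

# Sketch: project onto a 10-d subspace that contains v, so inner products
# with v are preserved exactly.
rng = np.random.default_rng(0)
v = rng.normal(size=100)

# Orthonormal 10x100 projection whose row space contains v: put v first,
# then orthonormalize 9 random directions against it with QR.
basis, _ = np.linalg.qr(np.column_stack([v, rng.normal(size=(100, 9))]))
R = basis.T                                    # 10 x 100, orthonormal rows

w1, w2 = rng.normal(size=100), rng.normal(size=100)
v_p, w1_p, w2_p = R @ v, R @ w1, R @ w2
print(np.isclose(v @ w1, v_p @ w1_p), np.isclose(v @ w2, v_p @ w2_p))
```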
xunzhang
0 votes, 1 answer

How do I obtain only the first principal component in MATLAB?

For certain measurements I need to obtain only the numeric values of the first principal component from the matrix. Can someone please tell me how I would go about it?
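In MATLAB, the first column of the score output of princomp/pca holds these values; for comparison, here is a short scikit-learn sketch that keeps only the first component (the data is a random stand-in).

```python
import numpy as np
from sklearn.decomposition import PCA

# Sketch: scores and loading vector of the first principal component only.
X = np.random.rand(50, 4)                      # stand-in measurement matrix

pca = PCA(n_components=1).fit(X)
first_scores = pca.transform(X)[:, 0]          # projection on the first PC
first_direction = pca.components_[0]           # loading vector (length 4)
print(first_scores.shape, first_direction)
```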
Sid
0 votes, 1 answer

Reduce the dimensions of a dataset after applying PCA to it in R

My question is how to use the principal components obtained in R. Once you get the principal components, how do you use them to reduce the dimensions? I have a data_set containing 6 variables that I need to cluster using k-means. K-means gives me a…
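The usual workflow (sketched here in Python; in R the scores live in prcomp(x)$x) is to keep the first few principal-component scores and run k-means on those instead of the raw variables.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.cluster import KMeans

# Sketch: cluster on the reduced PCA scores rather than the original columns.
X = np.random.rand(500, 6)                     # 6 variables, as in the question

scores = PCA(n_components=2).fit_transform(X)  # reduced representation (500 x 2)
labels = KMeans(n_clusters=3, n_init=10, random_state=0).fit_predict(scores)
print(labels[:10])
```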
N2M
0 votes, 1 answer

Reducing dimensionality of data using SOMs

As part of a school project, I had to read a paper by Steven Lawrence about using SOMs and CCNs to detect faces. For those of you who are curious, here's the paper: http://clgiles.ist.psu.edu/papers/UMD-CS-TR-3608.face.hybrid.neural.nets.pdf On…
0 votes, 1 answer

How can I know which dimensions make up the principal components?

I use MATLAB's princomp function to do PCA. From my understanding, I can check latent to decide how many dimensions I need: [coeff, score, latent, t2] = princomp(fdata); cumsum(latent)./sum(latent); And by using trainMatrix = coeff(:,1:10)…
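Two things usually answer this: the cumulative share of latent (the eigenvalues) tells you how many components to keep, and each principal component is a weighted combination of all original dimensions rather than a subset of them, with the weights in the columns of coeff. A scikit-learn sketch of the same bookkeeping:

```python
import numpy as np
from sklearn.decomposition import PCA

# Sketch: cumulative explained variance picks the number of components;
# each component's weights over the original dimensions are in components_.
X = np.random.rand(200, 30)

pca = PCA().fit(X)
cumulative = np.cumsum(pca.explained_variance_ratio_)
n_keep = int(np.searchsorted(cumulative, 0.95) + 1)   # e.g. 95% of the variance
print(n_keep, pca.components_[0][:5])          # weights of the first component
```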
Freya Ren
0 votes, 1 answer

Is there a good library to do NMF fast?

I have a sparse matrix whose shape is 570000*3000. I tried nimfa to do NMF (using the default nmf method, with max_iter set to 65). However, I found nimfa very slow. Has anyone used a faster library (usable from Python/R) or software to do NMF?
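One option worth trying is scikit-learn's NMF, which accepts scipy sparse matrices directly; the sketch below uses a smaller random sparse matrix as a stand-in for the 570000x3000 one, and the hyperparameters are only examples.

```python
import scipy.sparse as sp
from sklearn.decomposition import NMF

# Sketch: NMF on a sparse matrix without ever densifying it.
X = sp.random(5000, 3000, density=0.001, format="csr", random_state=0)

model = NMF(n_components=20, init="nndsvd", max_iter=65, random_state=0)
W = model.fit_transform(X)                     # 5000 x 20
H = model.components_                          # 20 x 3000
print(W.shape, H.shape)
```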
Hanfei Sun
0 votes, 2 answers

Deciding about dimensionality reduction with PCA

I have 2D data (zero-mean, normalized). I know its covariance matrix, eigenvalues and eigenvectors. I want to decide whether or not to reduce the dimension to 1 (I use principal component analysis, PCA). How can I decide? Is…
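The standard criterion is the fraction of total variance carried by the largest eigenvalue of the covariance matrix; if it is close to 1, one component keeps almost all the structure. A minimal numpy sketch with made-up 2D data:

```python
import numpy as np

# Sketch: explained-variance fraction of the top principal component.
X = np.random.multivariate_normal([0, 0], [[3.0, 1.2], [1.2, 0.7]], size=1000)

cov = np.cov(X, rowvar=False)
eigenvalues = np.linalg.eigvalsh(cov)          # ascending order
explained = eigenvalues[-1] / eigenvalues.sum()
print(f"top component explains {explained:.1%} of the variance")
```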
kamaci
0 votes, 1 answer

How to do Latent Semantic Analysis on a very large dataset

I am trying to run LSA or principal component analysis on a very large dataset, about 50,000 documents and over 300,000 words/terms, to reduce the dimensionality so I can graph the documents in 2-D. I have tried in Python and in MATLAB but my…
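A common way to keep this tractable is to leave the term-document matrix sparse and apply a truncated SVD, which is essentially what LSA amounts to; here is a scikit-learn sketch with toy documents standing in for the real corpus.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.decomposition import TruncatedSVD

# Sketch: LSA as TruncatedSVD on a sparse tf-idf matrix, reduced to 2-D
# so the documents can be plotted.
documents = ["the cat sat on the mat", "dogs and cats", "stock prices fell"] * 100

tfidf = TfidfVectorizer().fit_transform(documents)     # sparse (n_docs x n_terms)
lsa = TruncatedSVD(n_components=2, random_state=0)
coords = lsa.fit_transform(tfidf)
print(coords.shape)
```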