Questions tagged [pca]

Principal component analysis (PCA) is a statistical technique for dimension reduction, often used alongside clustering or factor analysis. Given any number of explanatory variables, PCA constructs new variables (the principal components) as linear combinations of the originals and ranks them by how much of the variation in the data each one explains. It is this property that allows PCA to be used for dimension reduction, i.e. to keep the few components that capture most of the variation instead of the full set of possible influences.

Overview

Mathematically, principal component analysis (PCA) amounts to an orthogonal transformation of possibly correlated variables (vectors) into uncorrelated variables called principal component vectors.
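The transformation described above can be sketched in a few lines of Python. This is a minimal illustration for two variables only, using the closed-form eigendecomposition of the 2x2 sample covariance matrix; the helper name `pca_2d` and the sample data are assumptions made up for this sketch, not part of any library or question on this page.

```python
import math


def pca_2d(points):
    """PCA for two variables via the closed-form eigendecomposition of the
    2x2 sample covariance matrix (illustrative helper, not a library API)."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    centered = [(x - mean_x, y - mean_y) for x, y in points]

    # Sample covariance matrix [[sxx, sxy], [sxy, syy]] (divisor n - 1).
    sxx = sum(x * x for x, _ in centered) / (n - 1)
    syy = sum(y * y for _, y in centered) / (n - 1)
    sxy = sum(x * y for x, y in centered) / (n - 1)

    # Eigenvalues of a symmetric 2x2 matrix, largest first.
    half_trace = (sxx + syy) / 2
    root = math.hypot((sxx - syy) / 2, sxy)
    eigenvalues = (half_trace + root, half_trace - root)

    # Angle of the first principal axis; the second axis is perpendicular.
    theta = 0.5 * math.atan2(2 * sxy, sxx - syy)
    pc1 = (math.cos(theta), math.sin(theta))
    pc2 = (-math.sin(theta), math.cos(theta))

    # Scores: coordinates of each centered point in the rotated basis.
    scores = [(x * pc1[0] + y * pc1[1], x * pc2[0] + y * pc2[1])
              for x, y in centered]
    return eigenvalues, (pc1, pc2), scores


# Made-up, strongly correlated sample data (hypothetical, for illustration).
points = [(2.5, 2.4), (0.5, 0.7), (2.2, 2.9), (1.9, 2.2), (3.1, 3.0),
          (2.3, 2.7), (2.0, 1.6), (1.0, 1.1), (1.5, 1.6), (1.1, 0.9)]

eigenvalues, components, scores = pca_2d(points)

# Share of the total variance carried by each principal component.
explained = [ev / sum(eigenvalues) for ev in eigenvalues]

# Sample covariance between the two score columns (zero up to rounding).
score_cov = sum(a * b for a, b in scores) / (len(scores) - 1)

print("explained variance ratio:", explained)
print("covariance between PC scores:", score_cov)
```

The two component vectors are orthonormal, and the covariance between the score columns vanishes up to floating-point rounding, which is what "uncorrelated" means here. For real data with more than two variables, an established implementation such as R's prcomp or scikit-learn's PCA is the more sensible choice.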

Tag usage

Questions tagged pca should be about implementation and programming problems, not about the statistical or theoretical properties of the technique. Consider whether your question might be better suited to Cross Validated, the Stack Exchange site for statistics, machine learning and data analysis.

In R, the software environment for statistical computing and graphics, the functions princomp and prcomp compute PCA.

2728 questions

1 answer

How to visualize a stepwise movement of PCA variables calculated repeatedly using different subsets of the same data to gain insights?

Imagine I have the dataset below for 34 subjects (I randomly sliced 100 observations from it because of line limitation here in the body). Each subject has multiple observations for different time points and regions. I would like to compare two…

ggplot2 shiny plotly tidyverse pca

asked Jul 18 '23 at 16:26

doctorate

2 answers

Adding lines to connect separate cluster in a chart

I saw this neat principal component analysis graph online, where they had lines connecting each cluster to a center point. I used an example data set to show that I have made it up to adding the ellipses, but after looking online, I think this PCA…

r pca

asked Jul 17 '23 at 23:59

dkcongo

1 answer

How do I find the direction of greatest variance in a matrix?

I have rainfall data (corresponding to latitude, longitude, and amount of rainfall at that latitude and longitude.) I plotted this data using a 2D matrix, with matrix(i,j) corresponding to the i'th and j'th (sorted) latitudes and longitudes in my…

python matrix pca variance

asked Jul 11 '23 at 21:39

requiemman

0 answers

Meaning Of High Variance With Few Components In PCA

I am new to PCA and currently applying it to a dataset where each sample consists of 500 measurement points. When applying PCA, the cumulative variance of the top 5 components is ~99%, which puzzles me. Could this dataset be described with 5…

pca

asked Jul 06 '23 at 11:44

Mark wijkhuizen

0 answers

Error for k-fold cross-validation and PCR in R with simulated data

For my thesis I am testing whether 5-fold cross-validation can be used to find the optimal number of principal components in PCR for time series data. I am using a 3 factor model. However, when I try to run the PCR code I get an error as the data…

r pca cross-validation mse

asked Jun 27 '23 at 10:38

Mieska

0 answers

Clustering on mixed data with related variables

I'm working with a mixed dataset (unique at the firm-year level) with related variables that look something like the following (but with many more variables of a similar nature), where: "sec" is the sector the firm belongs to and doesn't change…

r cluster-analysis pca data-processing unsupervised-learning

asked Jun 27 '23 at 10:37

jess0192

1 answer

What is the line in a 3D pca and its meaning?

Recently, I focused on 3D PCA. And I know how to produce 3D PCA plots through different packages in R, such as plotly, rgl and so on. But I have a small question from the picture below: I don't know how to add vertical lines in R just as the picture…

r pca prcomp

asked Jun 26 '23 at 13:09

花落思量错

0 answers

Using functional Principal components to make predictions

I would like to use FPCA to reconstruct a partial curve. I have 20 temperature curves and each curve contains 365 days. I would like to do FPCA on 15 curves and extract functional PCA. The other 5 curves only have data up to 100 days. I would like…

python functional-programming pca

asked Jun 26 '23 at 03:46

bayoote

1 answer

Getting pca.explained_variance_ratio_ for all components without doing PCA twice

I understand that explained_variance_ratio_ can be obtained easily using PCA but will be restricted to the contribution from the first n_components. I was wondering if explained_variance_ratio_ can be obtained for all components without doing PCA…

python scikit-learn pca

asked Jun 25 '23 at 22:07

xinit

1 answer

How to display Prince PCA Eigenvectors

I am looking for a way to display the eigenvectors in the prince library. Could you please tell me what the command is, as I cannot find it in the documentation (eigenvalues_ for eigenvalues) :)

pca

asked Jun 16 '23 at 09:54

zazoupile

1 answer

What is the region produced by ggforce package (geom_mark_ellipse)

Here is my reproducible data and code: dd<-structure(list(chr1_1005501 = c(0.597222222222222, 0.75, 0.775, 0.732456140350877, 0.860696517412935, 0.777777777777778, 0.654545454545455, …

r pca confidence-interval

asked Jun 15 '23 at 10:13

花落思量错

0 answers

Pca of a vector

I have a time series data with 4 columns (time, radial, axial and temp) and 900 data points for each column. I have 4 num classes with 10 samples in each class. I want to convert each sample to a vector of dimension 1*2700, perform pca, then plot…

google-colaboratory vectorization pca

asked Jun 12 '23 at 19:45

Zlatan

0 answers

How to Adjust Weights Using PCA?

I have a data set composed of 5 indicators D1, D2, D3, D4 and D5, and their weighted sum DS, which I use to create the binary variable EP. library(tidyverse) weight <- rep(1/5,5) names(weight) <- c("D1","D2","D3","D4","D5") data <- data.frame(D1 =…

r pca

asked Jun 09 '23 at 20:57

Saïd Maanan

1 answer

extract principal components from PCA in missMDA

I'm performing a multiple imputation PCA on a dataset that has several missing values in one variable, and I want to extract the first principal component to use in another model, but I can't figure out how to extract it from the results. #…

r pca

asked Jun 06 '23 at 19:00

tnt

1 answer

Get values of red arrows created by biplot()

I wanted to determine the values of those red arrows created by the biplot() with the usage of prcomp() function. I would like to determine arrows length and position to compare them more accurately. Here is the code that I use: df <- read.csv(file…

r statistics pca

asked Jun 06 '23 at 18:29

Axton
