Questions tagged [pca]

Principal component analysis (PCA) is a statistical technique for dimension reduction, often used in clustering or factor analysis. Given any number of explanatory or causal variables, PCA constructs new variables (principal components), each a linear combination of the originals, and ranks them by how much of the variation in the data they explain. It is this property that allows PCA to be used for dimension reduction, i.e. to summarize a large set of possible influences with a small number of components.

Overview

Mathematically, principal component analysis (PCA) amounts to an orthogonal transformation of possibly correlated variables (vectors) into uncorrelated variables called principal component vectors.
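
As a rough illustration of that definition, here is a minimal NumPy sketch that computes PCA through the singular value decomposition of the centered data. The helper name `pca_svd` and the synthetic data are assumptions made for this example only; they do not come from any particular library.

```python
import numpy as np

def pca_svd(X, n_components):
    """Sketch of PCA via SVD; rows of X are samples, columns are variables."""
    Xc = X - X.mean(axis=0)                     # center each variable
    U, S, Vt = np.linalg.svd(Xc, full_matrices=False)
    components = Vt[:n_components]              # orthonormal principal axes (one per row)
    scores = Xc @ components.T                  # data expressed in the new basis
    explained_variance = S[:n_components] ** 2 / (X.shape[0] - 1)
    return scores, components, explained_variance

# Synthetic data (assumption): two strongly correlated variables plus an independent one.
rng = np.random.default_rng(0)
a = rng.normal(size=200)
X = np.column_stack([a, 2 * a + 0.1 * rng.normal(size=200), rng.normal(size=200)])

scores, components, var = pca_svd(X, n_components=3)

# The covariance of the scores should be (numerically) diagonal: the new variables
# are uncorrelated, and the diagonal holds the variance explained by each component.
cov_scores = np.cov(scores, rowvar=False)
```

The rows of `components` form an orthonormal set, which is what makes the transformation orthogonal, and the components come out ordered by decreasing explained variance.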

Tag usage

Questions on this tag should be about implementation and programming problems, not about the statistical or theoretical properties of the technique. Consider whether your question might be better suited to Cross Validated, the Stack Exchange site for statistics, machine learning and data analysis.

In R, the functions princomp and prcomp compute PCA.

2728 questions
0 votes, 0 answers

Object detection, without prior classification

I'm looking for a method of detecting multiple, arbitrary, possibly overlapping objects in a dataset. An example image is shown below. I would like to find an algorithm to detect that there are 2 principal characters in the image, their approximate…
0 votes, 0 answers

Generate a line in (x,y,z)-coords for n-samples. Check if sample n+1 has absolute distance to the generated line below certain threshold

I have an array of 50 samples of x,y,z-coordinates. These samples represent a linear movement of an arm. I want to do the following: Find the best fitting line in the 3d-coordinate system for 50 samples. Based on the new calculated line, check if a…
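
One possible way to approach this kind of problem, sketched under assumptions: the best-fitting line in the least-squares (orthogonal distance) sense passes through the centroid of the samples along the first principal axis. The function names, the synthetic samples, and the threshold value below are hypothetical and chosen only for illustration.

```python
import numpy as np

def fit_line(points):
    """Fit a 3D line by PCA: returns a point on the line and a unit direction."""
    centroid = points.mean(axis=0)
    _, _, Vt = np.linalg.svd(points - centroid, full_matrices=False)
    return centroid, Vt[0]                      # first principal axis is a unit vector

def distance_to_line(point, centroid, direction):
    """Perpendicular distance from a point to the fitted line."""
    offset = point - centroid
    perpendicular = offset - (offset @ direction) * direction
    return np.linalg.norm(perpendicular)

# Synthetic stand-in for the 50 samples (assumption): a noisy straight movement.
rng = np.random.default_rng(1)
t = np.linspace(0, 10, 50)
true_dir = np.array([1.0, 1.0, 0.0]) / np.sqrt(2)
start = np.array([1.0, 2.0, 3.0])
samples = start + np.outer(t, true_dir) + 0.01 * rng.normal(size=(50, 3))

centroid, direction = fit_line(samples)

threshold = 0.5                                 # hypothetical tolerance, in data units
on_line = start + 12.0 * true_dir               # a new sample that continues the movement
off_line = on_line + np.array([0.0, 0.0, 2.0])  # a new sample displaced from the line

is_close = distance_to_line(on_line, centroid, direction) < threshold
is_far = distance_to_line(off_line, centroid, direction) < threshold
```

The sign of `direction` is arbitrary, which does not affect the distance computation.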
0 votes, 1 answer

Make subplots from scatter plot using PCA values to visualize them better

I'm running PCA on my dataset, which has a shape of (100, 103), and I want to run PCA on the first 60 variables. I followed this code from plotly https://plotly.com/python/pca-visualization/ import pandas as pd import plotly.express as…
Bertha
0 votes, 0 answers

Why is the legend not in the correct order in the biplot? I have 2 target variables; can I add both to one PCA plot with two different legends?

I ran the code below, but the legend is not in the correct order. Also, I have 2 target variables. It would be valuable if I added both variables with two legends. from sklearn.decomposition import PCA from sklearn.preprocessing import…
M MA
0 votes, 0 answers

I am performing SVD on a matrix in Python and MATLAB. SVD's "V" values across columns are the same in both (ignoring sign) apart from the last column. Why?

SVD on the original matrix produces the same results in both Python and MATLAB (ignoring the signs). However, when I center the matrix, the last column of V differs between Python and MATLAB. MATLAB…
0 votes, 0 answers

How to compute variable contributions to the principal axes for RDA in R? (Error: rdacca can't be handled by factoextra)

data("mite") # Load mite species abundance data data("mite.env") # Load envdata # Hellinger transform the community data mite.spe.hel <- decostand(mite, method = "hellinger") mite.env <- mite.env[,1:2] mite.env$SoilCont <-…
Share
0 votes, 0 answers

Some PCA loadings of a given matrix in MATLAB and Python produce opposite signs. While the magnitude is the same, I require the same signs for summation

I have a matrix X on which, after centering, I perform SVD. While I get the same magnitude of the loadings, some of the columns produce opposite signs between MATLAB and Python. I understand the opposite signs do not matter as far as the PCA is…
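
Because each singular vector is only defined up to sign, one way to get comparable results across tools is to apply the same deterministic sign rule to every output. The sketch below uses one such rule, an assumption picked for illustration rather than the convention of any specific library: flip each component so that its largest-magnitude entry is positive. The helper name `fix_signs` is hypothetical.

```python
import numpy as np

def fix_signs(Vt):
    """Flip rows of Vt so that each row's largest-magnitude entry is positive."""
    idx = np.argmax(np.abs(Vt), axis=1)
    signs = np.sign(Vt[np.arange(Vt.shape[0]), idx])
    return Vt * signs[:, None], signs

# Synthetic data (assumption), centered before the decomposition.
rng = np.random.default_rng(2)
X = rng.normal(size=(30, 4))
Xc = X - X.mean(axis=0)
U, S, Vt = np.linalg.svd(Xc, full_matrices=False)

Vt_fixed, signs = fix_signs(Vt)
U_fixed = U * signs            # flip the matching columns of U so U S Vt is unchanged

# Simulate a second tool that returns some components with the opposite sign.
Vt_other = Vt * np.array([1.0, -1.0, 1.0, -1.0])[:, None]
Vt_other_fixed, _ = fix_signs(Vt_other)
```

After the same rule is applied to both outputs, `Vt_fixed` and `Vt_other_fixed` agree, and the reconstruction `(U_fixed * S) @ Vt_fixed` still equals the centered matrix. The rule breaks down only if a component's largest entries tie in magnitude or are exactly zero.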
0 votes, 0 answers

Ellipsoids for PCA in MATLAB

I'm doing PCA on my data using MATLAB and want to make a 3D PCA plot with 95% ellipsoids for each category. My code is as follows: % Input data InputData = [7.72 6.73 3.33 0.12 0.06 -0.31 1; 8.92 8.22 4.56 0.06 0.72 0.01 1; …
STEMQs
0 votes, 1 answer

How to make 3D PCA plot with 95% ellipsoids in MATLAB

I want to make a 3D PCA plot with the first three principal components and to have a 95% confidence ellipsoid for each class (label). Here is my code: % Input data InputData = [7.72 6.73 3.33 0.12 0.06 -0.31 1; 8.92 8.22 4.56 0.06 0.72…
STEMQs
0 votes, 1 answer

PCA analysis: how can you identify which variable the first component corresponds to?

I'm running PCA with 31 variables, and I need to know which variable corresponds to the first principal component and captures the most variance. Here is my code: mat=cov(df_15[11:41]) pca=princomp(covmat=mat) summary(pca) But the output names the…
Johnny
0 votes, 0 answers

near-alignment of PC scores and variable vector (biplot)

Here is the screenshot of my PCA biplot. As is evident, there is an alignment (although not perfect) of my PC scores and the variable vector. Is there any interpretation of this alignment? PCA biplot Ray this is a question of graph interpretation,…
0 votes, 0 answers

Why is the PCA code not running? The matrix is 0х0

df_scaled <- as.data.frame(scale(df)) features <- df_scaled[, -10] target <- df_scaled[, 10] pc <- PCA(features, graph = FALSE) print(pc) Error in eigen(crossprod(t(X), t(X)), symmetric = TRUE) :0х0 matrix where is the error? I have 19178 obs. of …
0 votes, 1 answer

Using different scales for the secondary axis with ggplot in R

I am trying to create a biplot from the iris data set using the ggplot2 package. I used the code below to generate the biplot: library(ggplot2) library(devtools) # Load iris dataset data(iris) # Run PCA and extract scores and loadings iris_pca <-…
Farhan
0 votes, 0 answers

How to use principal component analysis (PCA) to answer EDA questions and how to use principal components in linear regression?

backpainX <- backpain_data[7:10] backpain.pca <- prcomp(backpainX, center=TRUE, scale=TRUE) summary(backpain.pca) backpain.pca$rotation How can I perform PCA to combine the four pain variables (neck, thoracic, lumbar, sacral) and take the…
Stebs
0 votes, 0 answers

How to plot loadings of LD1 features after applying PCA and LDA on a dataset?

I am working with a dataset, X, and have applied PCA to reduce its dimensionality down to three principal components (PC1, PC2, and PC3). Following that, I used LDA to maximize the separation between two classes, resulting in LD1. My challenge is to…
vdu16