
I need to run a PCA on a large dataset with multiple variables. For some of these variables, however, I have only one constant value for each location. My question is: will this repeated value carry more weight in my PCA and distort it? Is there a better way than PCA to combine a dataset that has multiple values per location with another that has only one value per location? It is important for me to include both kinds of variables.

Here is a simplified dataset to illustrate my problem (factor is the repeated value, constant within each zone, whereas temp and sal change over time):

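# temp and sal vary over time: 40 observations, 10 per zone
temp <- sample(1:30, 40, replace = TRUE)
sal <- sample(30:35, 40, replace = TRUE)
# factor is constant within each zone (note: the name shadows base::factor)
factor <- c(rep(9, 10), rep(5, 10), rep(7, 10), rep(1, 10))
zone <- c(rep("A", 10), rep("B", 10), rep("C", 10), rep("D", 10))
d <- data.frame(zone, factor, temp, sal)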

In my case, there are many more variables of each type, so I need to do a PCA or a similar type of analysis.
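To make the setup concrete, here is a minimal sketch of two ways the toy data could be fed to a PCA with base R's `prcomp`. It is only an illustration under assumptions, not a recommendation: the seed, the object names `pca`, `d_zone`, and `pca_zone`, and the idea of averaging per zone are not part of the original question. With `scale. = TRUE` every column is standardised to unit variance, so the per-zone constant enters with the same total variance as temp and sal; whether the repeated rows are acceptable is still a statistical judgment.

```r
# Hypothetical illustration; the seed and object names are assumptions.
set.seed(42)  # arbitrary seed so the random draws are reproducible
temp <- sample(1:30, 40, replace = TRUE)
sal <- sample(30:35, 40, replace = TRUE)
factor <- c(rep(9, 10), rep(5, 10), rep(7, 10), rep(1, 10))
zone <- rep(c("A", "B", "C", "D"), each = 10)
d <- data.frame(zone, factor, temp, sal)

# Option 1: PCA on all 40 rows, centred and scaled to unit variance
pca <- prcomp(d[, c("factor", "temp", "sal")], center = TRUE, scale. = TRUE)
summary(pca)   # proportion of variance explained by each component
pca$rotation   # loadings: how each variable contributes to each component

# Option 2 (assumption): collapse to one row per zone so that each
# location contributes a single observation for every variable
d_zone <- aggregate(cbind(factor, temp, sal) ~ zone, data = d, FUN = mean)
pca_zone <- prcomp(d_zone[, -1], center = TRUE, scale. = TRUE)
pca_zone$rotation
```

Comparing `pca$rotation` with `pca_zone$rotation` gives a rough sense of how much the repeated rows change the loadings, at the cost of discarding the within-zone variation in option 2.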

Mimosa
    If you need help choosing how to fit statistical models, you should ask for help at [stats.se] instead. You are likely to get better help there. This is not really a specific programming question that's appropriate for Stack Overflow. – MrFlick Jun 01 '23 at 14:04

0 Answers