How to apply principal component analysis to standardised multicentric data?

Question

I have a question about principal component analysis.

I am working with a dataset comprising 2 cohorts from 2 different centres. From each centre I have a control group and 2 patient subgroups (drug-resistant and drug-responsive). My objective is to analyse neurocognitive test data, which were collected from all subjects during the study. The problem is that the cognitive tests applied differ slightly across centres. I therefore standardised the raw scores in each patient subgroup relative to the control group of their respective centre. I am still left with a large dataset of z-scores and would like to reduce its dimensionality further with PCA.
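To make the standardisation concrete, here is a minimal sketch of what I mean, assuming the data sit in a NumPy array with one row per subject and one column per test. The function name `control_referenced_z`, the centre labels "A" and "B", and the simulated scores are all made up for illustration; they are not my real data.

```python
import numpy as np

def control_referenced_z(scores, centre, is_control):
    """Standardise each test (column) relative to the controls of the same centre."""
    scores = np.asarray(scores, dtype=float)
    centre = np.asarray(centre)
    is_control = np.asarray(is_control, dtype=bool)
    z = np.empty_like(scores)
    for c in np.unique(centre):
        in_centre = centre == c
        ref = scores[in_centre & is_control]   # controls of this centre only
        mu = ref.mean(axis=0)
        sd = ref.std(axis=0, ddof=1)           # sample SD of the controls
        z[in_centre] = (scores[in_centre] - mu) / sd
    return z

# Simulated example (hypothetical numbers): 2 centres x 3 groups x 30 subjects, 5 tests
rng = np.random.default_rng(0)
n_per_group, n_tests = 30, 5
centre = np.repeat(["A", "B"], 3 * n_per_group)
group = np.tile(np.repeat(["control", "resistant", "responsive"], n_per_group), 2)
is_control = group == "control"

raw = rng.normal(size=(centre.size, n_tests))
raw[group == "resistant"] -= 1.0                    # patients score lower on average
raw[group == "responsive"] -= 0.5
raw[centre == "B"] = raw[centre == "B"] * 10 + 50   # centre B uses a different scale

z = control_referenced_z(raw, centre, is_control)
print("control means:", z[is_control].mean(axis=0))  # zero up to rounding error
print("overall means:", z.mean(axis=0))              # generally not zero
```

With this scheme the controls of each centre have mean 0 and SD 1 on every test, while the pooled sample does not.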

My question is: does it make sense to apply PCA after standardising the data this way? (I am not sure I can call them z-scores, as I standardised them relative to the mean and standard deviation of the controls of their respective centre and not of the entire sample!) The column means will therefore not be equal to 0. Would it still be legitimate to apply PCA? And do you think I should scale the variables again?
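For reference, this is a rough sketch of the two options I am weighing, written as a plain SVD-based PCA in NumPy. The helper `pca` and the simulated matrix `Z` are hypothetical, not a particular library's API. As far as I understand, PCA centres the columns as a first step, so non-zero column means should not change the components; the open choice is whether to rescale each column to unit variance in the pooled sample (correlation-based PCA) or to keep the control-referenced units (covariance-based PCA).

```python
import numpy as np

def pca(X, scale=False):
    """PCA via SVD. Columns are always centred; optionally scaled to unit variance."""
    X = np.asarray(X, dtype=float)
    Xc = X - X.mean(axis=0)                    # centring removes any column offset
    if scale:
        Xc = Xc / X.std(axis=0, ddof=1)        # correlation-based PCA
    U, s, Vt = np.linalg.svd(Xc, full_matrices=False)
    explained_variance = s**2 / (X.shape[0] - 1)
    scores = U * s                             # subject coordinates on the components
    return scores, Vt, explained_variance

# Hypothetical control-referenced scores: non-zero means, unequal spreads
rng = np.random.default_rng(1)
Z = rng.normal(loc=-0.5, scale=[1.0, 1.5, 0.8, 2.0], size=(120, 4))

scores_cov, comps_cov, var_cov = pca(Z)              # keeps control-SD units
scores_cor, comps_cor, var_cor = pca(Z, scale=True)  # every test weighted equally
_, _, var_shifted = pca(Z + 3.0)                     # same data with shifted means

print(np.allclose(var_cov, var_shifted))  # a constant offset does not affect the result
print(var_cov / var_cov.sum())            # proportion of variance, covariance-based
print(var_cor / var_cor.sum())            # proportion of variance, correlation-based
```

In the covariance-based version, tests on which patients vary more (in control-SD units) dominate the first components; in the correlation-based version that information is discarded and every test contributes equally.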

Any suggestions or comments are much appreciated!

Best wishes, Bernardo

