I have a data set composed of 5 indicators D1, D2, D3, D4 and D5, and their weighted sum DS, which I use to create the binary variable EP.
library(tidyverse)
weight <- rep(1/5,5)
names(weight) <- c("D1","D2","D3","D4","D5")
data <- data.frame(D1 = sample(c(0,1), 100, replace = TRUE),
                   D2 = sample(c(0,1), 100, replace = TRUE),
                   D3 = sample(c(0,1), 100, replace = TRUE),
                   D4 = sample(c(0,1), 100, replace = TRUE),
                   D5 = sample(c(0,1), 100, replace = TRUE))
data <- data |>
  mutate(DS = rowSums(across(D1:D5, ~ .x * weight[cur_column()])),
         EP = if_else(DS >= 0.4, 1, 0))
I use equal weights to calculate DS, but I know the weights should not be equal, since the variables differ in their importance to EP.
I have seen this presentation here about calibrating the weights using PCA, and I would like to do the same for my data.
Can someone please show me how to use PCA in order to calibrate the weights according to the importance of the variables D1:D5 to the variable EP? Thank you very much in advance.
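
To make the request concrete, here is a minimal sketch of what I understand by PCA-based weights. It is an assumption on my part, not necessarily the method from the presentation: I take the squared loadings of the first principal component from prcomp() as the weights. The seed, the name pca_weight and the choice of PC1 are mine and only illustrative.

```r
set.seed(42)  # assumption: seed added only to make the sketch reproducible

# Same kind of simulated 0/1 indicators as above, built with base R
ind <- c("D1", "D2", "D3", "D4", "D5")
data <- as.data.frame(replicate(5, sample(c(0, 1), 100, replace = TRUE)))
names(data) <- ind

# PCA on the indicators only; EP is not used because PCA is unsupervised
pca <- prcomp(data[ind], center = TRUE, scale. = TRUE)

# Assumed weighting rule: squared loadings of PC1.
# They sum to 1 because each column of the rotation matrix has unit length.
pca_weight <- pca$rotation[, "PC1"]^2

# Recompute the weighted sum and the binary variable with the new weights
data$DS <- as.vector(as.matrix(data[ind]) %*% pca_weight)
data$EP <- ifelse(data$DS >= 0.4, 1, 0)

print(round(pca_weight, 3))
```

Note that weights obtained this way reflect how much each indicator contributes to the shared variance among D1:D5, not its importance to EP, because EP never enters the PCA.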