I'm trying to predict GDP per capita from about 50 variables. PCA shows that the first two principal components explain about 40% of the variance in those variables. What I want to know is how much of the variance in GDP per capita specifically they explain, because when I plot GDP per capita against the values predicted from the first two principal components, I get an R^2 of about 90%.

When I set up the PCA in R, I don't know how to specify that the goal is to explain variation in GDP per capita (not in the other variables).

I have:

    pca <- prcomp(allexceptgdppc, center = TRUE, scale = TRUE)
    zpca <- predict(pca, all)

Maybe I need to pass just the GDP per capita variable instead of all, but R says I must use a data frame.

Any help would be amazing, thanks so much.

  • I don't think this belongs on SO; it could probably go on another SE site. That said, I think you need to create a model: the inputs should be the PCA components you want to use and the output should be GDP. Fit the model, then use it for predictions. PCA by itself is a method of dimensionality reduction. – matt Jun 06 '23 at 11:36
  • Maybe you could make this more appropriate by including a complete example, with some data, where you're just missing the prediction. It sounds like you're using R? – matt Jun 06 '23 at 11:37
  • Please provide enough code so others can better understand or reproduce the problem. – Community Jun 07 '23 at 00:45
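
The approach described in the first comment (component scores as model inputs, GDP per capita as the output) could be sketched roughly as follows. This is only an illustration under stated assumptions: the data are synthetic stand-ins generated in the snippet, and `gdppc`, `var_share`, `scores`, `fit`, and `r2` are hypothetical names that do not come from the question. It computes the two quantities the question contrasts: the share of the predictors' variance captured by the first two components, and the R^2 from regressing GDP per capita on those components.

```r
# Synthetic stand-in data (hypothetical): 200 rows and 50 predictors driven by
# two latent factors, plus a response that depends on the same factors.
set.seed(1)
n <- 200
p <- 50
f <- matrix(rnorm(n * 2), n, 2)            # two latent factors
loadings <- matrix(rnorm(2 * p), 2, p)     # how each predictor loads on them
allexceptgdppc <- as.data.frame(f %*% loadings + matrix(rnorm(n * p, sd = 2), n, p))
gdppc <- 3 * f[, 1] - 2 * f[, 2] + rnorm(n, sd = 0.5)   # hypothetical response

# PCA on the predictors only; the response is deliberately left out.
pca <- prcomp(allexceptgdppc, center = TRUE, scale. = TRUE)

# Share of the predictors' total variance captured by PC1 and PC2.
var_share <- sum(pca$sdev[1:2]^2) / sum(pca$sdev^2)

# Regress the response on the first two component scores.
scores <- as.data.frame(pca$x[, 1:2])      # columns are named PC1 and PC2
fit <- lm(gdppc ~ PC1 + PC2, data = scores)

# Share of the variance in the response explained by PC1 and PC2.
r2 <- summary(fit)$r.squared

cat("Predictor variance explained by PC1-2:", round(var_share, 3), "\n")
cat("R^2 for gdppc on PC1-2:", round(r2, 3), "\n")
```

The two numbers answer different questions, so they need not agree: PCA never sees the response, and the R^2 only appears once a model such as `lm` is fitted on the scores. For new observations, `predict(pca, newdata)` returns component scores, which can then be passed to `predict(fit, ...)` as a data frame with `PC1` and `PC2` columns.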

0 Answers