I know how to produce a PCA plot with ggbiplot, and the package works well.

Now I want to modify specific points: change their color and size and, in particular, draw a circle around individual points without covering them, rather than enclosing them with geom_encircle().

Here is my reproducible example:

#load required packages
library(ggplot2)
library(devtools)
library(ggbiplot)

#load dataset
data(iris)

#perform principal component analysis
pca = prcomp(iris[ , 1:4], scale=T)

#define classes, generate & view PCA biplot
class = iris$Species
ggbiplot(pca, obs.scale = 1, var.scale = 1, groups = class, circle = FALSE)+
  geom_point(size = 3,aes(color = class))+
  geom_point(data=iris[iris$Species=="setosa",],pch=21, fill=NA, size=2, colour="black", stroke=2)

However, it produces this error:

Error in `geom_point()`:
! Problem while computing aesthetics.
i Error occurred in the 5th layer.
Caused by error in `FUN()`:
! object 'xvar' not found
Run `rlang::last_trace()` to see where the error occurred.

I suspect it is caused by the data passed to geom_point(), which is not consistent with the data ggbiplot builds from pca.

But I don't know how I should set the data in geom_point().

I hope somebody can give me some advice or a solution.

Thanks in advance.

花落思量错
  • May want to note that `ggbiplot` is not on CRAN, and can be installed via `install_github("vqv/ggbiplot")` ... – Ben Bolker Jul 05 '23 at 19:40

2 Answers

You can do this in a hacky way by using ggplot_build() to retrieve the data frame that was constructed by ggbiplot.

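# rebuild the plot from the question with ggbiplot() so its layers can be inspected
gg0 <- ggbiplot(pca, obs.scale = 1, var.scale = 1, groups = class, circle = FALSE) +
  geom_point(size = 3, aes(color = class))
ggb <- ggplot_build(gg0)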

ggb$data is a list with a data frame for each layer of the plot. By poking around a bit we can figure out that the geom_point layer is the last (fourth), i.e. ggb$data[[4]]. All we need from this is the x and y coordinates, which we can combine with the original data set (hoping that row order is preserved, there weren't any incomplete cases discarded, etc.)
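As a rough illustration of the "poking around" step, the sketch below inspects the layers of a built plot. It uses a plain ggplot2 stand-in rather than ggbiplot so that it runs without the GitHub package, which means the layer count and order differ from the plot above; the names scores, p, built, layer_geoms, layer_rows and pts are made up for this example.

```r
library(ggplot2)

# PCA on the four numeric iris columns, as in the question
pca <- prcomp(iris[, 1:4], scale. = TRUE)
scores <- data.frame(pca$x, Species = iris$Species)

# two-layer stand-in plot (hypothetical; ggbiplot adds more layers than this)
p <- ggplot(scores, aes(PC1, PC2)) +
  geom_hline(yintercept = 0, linetype = "dashed") +
  geom_point(aes(colour = Species), size = 3)

built <- ggplot_build(p)

# one data frame per layer; the geom class names show which layer is which
layer_geoms <- vapply(p$layers, function(l) class(l$geom)[1], character(1))
layer_rows <- vapply(built$data, nrow, integer(1))
print(data.frame(layer = seq_along(layer_geoms), geom = layer_geoms, rows = layer_rows))

# the point layer has one row per observation, with the computed x and y
pts <- built$data[[which(layer_geoms == "GeomPoint")]]
head(pts[c("x", "y")])
```

Comparing the row count of the point layer with nrow(iris) is a cheap way to check that no observations were dropped before combining the coordinates with the original data.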

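library(ggalt)  # provides geom_encircle()

# attach the plotted coordinates to the original rows (assumes unchanged row order)
my_data <- cbind(iris, ggb$data[[4]][c("x", "y")])
m2 <- subset(my_data, Species == "setosa")
gg0 +
   geom_encircle(data = m2, aes(x = x, y = y)) +
   geom_point(data = m2, aes(x = x, y = y),
              pch = 21, fill = NA, size = 2, colour = "black", stroke = 2)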
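An alternative that avoids ggplot_build() may be to pass a function as the layer's data; ggplot2 applies such a function to the plot's own data. The sketch below uses a stand-in plot built with plain ggplot2 so that it runs without ggbiplot. The column names xvar, yvar and groups are an assumption about what ggbiplot stores in the plot data (the 'xvar' in the error message hints at this), so check names(gg0$data) before relying on it. The names base_data, base and ringed are made up for this example.

```r
library(ggplot2)

pca <- prcomp(iris[, 1:4], scale. = TRUE)

# stand-in for the ggbiplot() result: a plot whose own data uses the
# column names xvar, yvar and groups (assumed, not verified against ggbiplot)
base_data <- data.frame(xvar = pca$x[, 1], yvar = pca$x[, 2],
                        groups = iris$Species)
base <- ggplot(base_data, aes(x = xvar, y = yvar)) +
  geom_point(aes(colour = groups), size = 3)

# a function passed as `data` receives the plot's data and returns the subset;
# x and y are inherited from the plot, so no aes() is needed in this layer
ringed <- base +
  geom_point(data = function(d) d[d$groups == "setosa", ],
             shape = 21, fill = NA, size = 2, colour = "black", stroke = 2)

print(ringed)
```

If the assumption about the column names holds, the same geom_point() layer could be added directly to the ggbiplot() call from the question.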

Ben Bolker

In the end, I found a way to produce the plot.

My method replaces ggbiplot() with plain ggplot(). Here is my code:

#load required packages
library(ggplot2)
library(devtools)
library(ggbiplot)

#load dataset
data(iris)

#perform principal component analysis
pca = prcomp(iris[ , 1:4], scale. = TRUE)
class = iris$Species

#collect the PCA scores together with the class labels
data <- data.frame(pca$x, class = iris$Species)

#plot PC1 against PC2, then ring the setosa points
ggplot(data = data, aes(x = PC1, y = PC2)) +
  geom_point(size = 3, aes(color = class)) +
  geom_point(data = data[data$class %in% c("setosa"), ],
             pch = 21, fill = NA, size = 2, colour = "black", stroke = 2)

It works well, but I don't know how to do it with the original ggbiplot() function.

This method is good enough for my needs, but I would still like to know whether anyone can suggest other, similar methods.

Thanks in advance.

花落思量错