I am running PCAs on groups in a data set using dplyr pipelines. I am starting with group_split, so I am working with a list of data frames. In order to run the prcomp() function, only the numeric columns of each list element can be included, but I would like the factor column brought back in for plotting at the end. I have tried saving an intermediate output using {. ->> temp} partway through the pipeline, but since it is a list, I don't know how to index the grouping column when plotting.
library(tidyverse)
library(ggbiplot)
iris %>%
  group_split(Species, keep = T) %>% # group by species, one PCA per species
  {. ->> temp} %>% # save intermediate output to preserve the Species column for use in plotting later
  map(~.x %>% select_if(is.numeric) %>% select_if(~var(.) != 0) %>%
        prcomp(scale. = TRUE)) %>% # run PCA on numeric columns only
  map(~ggbiplot(.x), label=temp$Species) # plot each PCA, labelling points with species names from the temporary object
This produces one PCA plot for each species in the iris data set, but since temp$Species is NULL, the points are not labelled.
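For reference, here is a minimal sketch of one direction that might work, not a confirmed solution. It assumes the split list is stored under its own name and that each PCA is paired with its source tibble via purrr's map2, so the Species column of the matching group can be passed to ggbiplot's labels argument. The names species_list, run_pca, pcas and plots are made up for illustration and do not appear in my original pipeline.

```r
library(dplyr)
library(purrr)
library(ggbiplot)

# Split once and keep the list of per-species tibbles under its own name
# (species_list is a hypothetical name, standing in for `temp` above).
species_list <- iris %>% group_split(Species)

# The list itself has no element called Species, which is why `$` finds nothing;
# the column lives inside each tibble, e.g. species_list[[1]]$Species.
is.null(species_list$Species)

# Hypothetical helper: PCA on the numeric, non-constant columns of one group.
run_pca <- function(df) {
  df %>%
    select(where(is.numeric)) %>%
    select(where(~ var(.x) != 0)) %>%
    prcomp(scale. = TRUE)
}

# One PCA per species.
pcas <- map(species_list, run_pca)

# Walk the PCAs and the original tibbles in parallel, so that each plot
# receives the Species values of its own group as point labels.
plots <- map2(
  pcas, species_list,
  ~ ggbiplot(.x, labels = as.character(.y$Species))
)
```

Printing plots[[1]] should then draw the first biplot with each point replaced by its species name. If coloured points are preferred over text labels, ggbiplot's groups argument could be used in place of labels in the same call.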