In the following code, which plots parallel coordinates (reference), I want to display only a limited number of legend entries for column 5, which I use as the grouping column. Column 5 has many unique values, and when I plot, all of them are printed in the legend (see attached screenshot).

Is there a way to modify the following code so that only a few values are displayed in the legend? For example, I want to discard values >= 410 from the plot.

suppressPackageStartupMessages(library(dplyr))
suppressPackageStartupMessages(library(viridis))
suppressPackageStartupMessages(library(ggplot2))
suppressPackageStartupMessages(library(ggthemes))
suppressPackageStartupMessages(library(hrbrthemes))
suppressPackageStartupMessages(library(GGally))
suppressPackageStartupMessages(library(gridExtra))

# build the full plot first; print() returns the plot it was given, so wrapping
# only ggparcoord() in print() would display the plot before the theme is added
plt2 <- ggparcoord(data,
    columns = c(1, 2, 3, 4),
    groupColumn = 5, splineFactor = 20, # order = "skewness",
    scale = "globalminmax",
    showPoints = TRUE, title = "All Fts.",
    alphaLines = 0.5) + # + scale_color_viridis(discrete = TRUE)
  theme_economist_white(base_family = "Arial Narrow", gray_bg = FALSE) +
  scale_colour_economist() +
  theme(axis.line = element_line(color = 'black'), # legend.position = "none",
    plot.background = element_blank(),
    panel.grid.major = element_blank(),
    panel.grid.minor = element_blank(),
    panel.border = element_blank()) +
  scale_x_discrete(limits = c('Min Rows', 'Max Depth', 'MTries', 'AUC')) +
  # alpha must lie in [0, 1]; 1 makes the legend keys fully opaque
  guides(colour = guide_legend(override.aes = list(alpha = 1)))
# + scale_colour_manual(limits = c("10", "60", "110", "160", "210", "260", "310", "360", "410"))

suppressWarnings(print(plt2))

[Screenshot: parallel coordinates plot with every value of column 5 listed in the legend]

stefan
lpt

1 Answer

Perhaps you could filter the groups of interest before you plot? E.g.

library(tidyverse)
# install.packages("GGally")
library(GGally)

# example data
df <- iris

# add a row to be removed
df2 <- df %>%
  add_row(Sepal.Length = 5, Sepal.Width = 3,
          Petal.Length = 5, Petal.Width = 2, Species = "")
tail(df2)
#>     Sepal.Length Sepal.Width Petal.Length Petal.Width   Species
#> 146          6.7         3.0          5.2         2.3 virginica
#> 147          6.3         2.5          5.0         1.9 virginica
#> 148          6.5         3.0          5.2         2.0 virginica
#> 149          6.2         3.4          5.4         2.3 virginica
#> 150          5.9         3.0          5.1         1.8 virginica
#> 151          5.0         3.0          5.0         2.0

# normal plot
ggparcoord(data = df2,
           columns = 1:4,
           groupColumn = 5,
           splineFactor = 20)


# keep the groups of interest
ggparcoord(data = df2 %>%
             filter(Species %in% c("setosa", "virginica", "versicolor")),
           columns = 1:4,
           groupColumn = 5,
           splineFactor = 20)

Created on 2023-08-30 with reprex v2.0.2
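
For a numeric grouping column like the one in the question, the same idea might look like the sketch below. The data frame `toy` and its column names (`min_rows`, `max_depth`, `mtries`, `auc`, `grp`) are made up for illustration, since the original `data` was not posted; only the `filter(grp < 410)` step is the point.

```r
library(dplyr)
library(GGally)

# hypothetical stand-in for the question's data: four numeric columns plus a
# grouping column (column 5) taking the values 10, 60, ..., 460
set.seed(1)
toy <- data.frame(
  min_rows  = runif(40),
  max_depth = runif(40),
  mtries    = runif(40),
  auc       = runif(40),
  grp       = rep(seq(10, 460, by = 50), each = 4)
)

# drop the groups >= 410 while the column is still numeric, then convert to a
# factor so that only the remaining values exist as levels
toy_small <- toy %>%
  filter(grp < 410) %>%
  mutate(grp = factor(grp))

ggparcoord(data = toy_small,
           columns = 1:4,
           groupColumn = 5,
           splineFactor = 20,
           scale = "globalminmax")
```

Note that this removes the lines for those groups from the plot as well, not just their legend entries.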

Would this approach work with your actual data?
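
If the lines for every group should stay in the plot and only the legend should be shortened, a different technique from the filtering above is to pass `breaks` to the colour scale. This is a minimal sketch on made-up data (`toy` and its column names are hypothetical), and it uses `scale_colour_discrete()` rather than the `scale_colour_economist()` from the question, so the colours will differ.

```r
library(ggplot2)
library(GGally)

# hypothetical stand-in for the question's data, with column 5 already a factor
set.seed(1)
toy <- data.frame(
  min_rows  = runif(40),
  max_depth = runif(40),
  mtries    = runif(40),
  auc       = runif(40),
  grp       = factor(rep(seq(10, 460, by = 50), each = 4))
)

# factor levels whose numeric value is below 410; only these get a legend key
keep <- levels(toy$grp)[as.numeric(levels(toy$grp)) < 410]

p <- ggparcoord(data = toy,
                columns = 1:4,
                groupColumn = 5,
                scale = "globalminmax") +
  # breaks limits which keys the legend shows; all groups are still drawn
  scale_colour_discrete(breaks = keep)

p
```

Here the groups 410 and 460 are still drawn and coloured, they just have no key in the legend.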

jared_mamrot