2

I have horizontal dots plot visualized by ggPlotly. 3 numerical variables are put on plot. everything works nice:

library(ggplot2)

    df <- data.frame (origin = c("A","B","C","D","E","F","G","H","I","J"),
                  Percentage = c(23,16,32,71,3,60,15,21,44,60),
                  rate = c(10,12,20,200,-25,12,13,90,-105,23),
                  change = c(10,12,-5,12,6,8,0.5,-2,5,-2))

plt <- ggplot(df, aes(x = rate, y = factor(origin, rev(origin)))) +
  geom_segment(aes(x = (min(rate,change)-4), xend = (max(rate,change)+4),
                   y = origin, yend = origin), color = 'gray') +
  geom_vline(xintercept = 0, linetype = 2, color = 'gray') +
  #geom_vline(xintercept =17, linetype = 1, color = 'black') +
  geom_point(aes(fill = 'rate'), shape = 21, size = 10, color = NA) +
  geom_text(aes(label = rate, color = 'rate')) +
  geom_point(aes(x = change, fill = 'change'), 
             color = NA, shape = 21, size = 10) +
  geom_text(aes(label = change, x = change, color = "change")) +
  geom_point(aes(x = (max(rate,change)+5.5), fill = "Percentage"), color = "gray",
             size = 10, shape = 21) +
  geom_text(aes(x = (max(rate,change)+5.5), label = paste0(Percentage, "%")),size=3)+
  theme_minimal(base_size = 16) +
  scale_x_continuous(labels = ~paste0(.x, '%'), name = NULL) +
  scale_fill_manual(values = c('#aac7c4', '#5f9299','black')) +
  scale_color_manual(values = c("black", "white")) +
  theme(panel.grid = element_blank(),
        axis.text.y = element_text(color = 'gray50')) +
  labs(color = NULL, y = NULL, fill = NULL)+
  theme(axis.title = element_text(size=15), legend.title = element_text(size=2)) 

plt <- ggplotly(plt)

#customize legend
plt$x$data[[3]]$name <- plt$x$data[[3]]$legendgroup <-
  plt$x$data[[4]]$name <- plt$x$data[[4]]$legendgroup <- "rate"
plt$x$data[[5]]$name <- plt$x$data[[5]]$legendgroup <-
  plt$x$data[[6]]$name <- plt$x$data[[6]]$legendgroup <- "change"
plt$x$data[[7]]$name <- plt$x$data[[7]]$legendgroup <-
  plt$x$data[[8]]$name <- plt$x$data[[8]]$legendgroup <- "Percentage"

plt

However when I activate (remove #) geom_vline(xintercept =17, linetype = 1, color = 'black') code line, in order to add vertical line on plot, hiding variables from the legend does not work properly. For instance if we hide 'change' variable: numbers of 'rate' are disappeared while some of them are still shown. I think the solution should be found in plt$x$data. In addition, I want to order the categorical variable "origin descending from top to down by percentage, for example, if J has the highest percentage it should be on the top and also, I don't if that's possible but I want to keep A always on the bottom in ranking.

enter image description here

2 Answers2

1

Ordering the origin column by the percentages is straightforward. This is done at the data level, by converting origin to a factor whose levels are determined by the value of Percentage:

df$origin <- factor(df$origin, df$origin[order(df$Percentage)])

The reason that strange things were happening with your customized legend is that you added a layer before some of your existing layers, which throws off the indexing you are using to modify the legend groups at the end. The easiest fix for this is to draw the line after your existing layers:

plt <- ggplot(df, aes(x = rate, y = factor(origin, rev(origin)))) +
  geom_segment(aes(x = (min(rate,change)-4), xend = (max(rate,change)+4),
                   y = origin, yend = origin), color = 'gray') +
  geom_vline(xintercept = 0, linetype = 2, color = 'gray') +
  geom_point(aes(fill = 'rate'), shape = 21, size = 10, color = NA) +
  geom_text(aes(label = rate, color = 'rate')) +
  geom_point(aes(x = change, fill = 'change'), 
             color = NA, shape = 21, size = 10) +
  geom_text(aes(label = change, x = change, color = "change")) +
  geom_point(aes(x = (max(rate,change)+5.5), fill = "Percentage"), 
             color = "gray", size = 10, shape = 21) +
  geom_text(aes(x = (max(rate,change)+5.5), label = paste0(Percentage, "%")),
            size = 3)+
    geom_vline(xintercept =17, linetype = 1, color = 'black') +
  theme_minimal(base_size = 16) +
  scale_x_continuous(labels = ~paste0(.x, '%'), name = NULL) +
  scale_fill_manual(values = c('#aac7c4', '#5f9299','black')) +
  scale_color_manual(values = c("black", "white")) +
  theme(panel.grid = element_blank(),
        axis.text.y = element_text(color = 'gray50')) +
  labs(color = NULL, y = NULL, fill = NULL)+
  theme(axis.title = element_text(size=15), legend.title = element_text(size=2)) 

plt <- ggplotly(plt)

Now you can customize the legend groups exactly as before:

#customize legend
plt$x$data[[3]]$name <- plt$x$data[[3]]$legendgroup <-
  plt$x$data[[4]]$name <- plt$x$data[[4]]$legendgroup <- "rate"
plt$x$data[[5]]$name <- plt$x$data[[5]]$legendgroup <-
  plt$x$data[[6]]$name <- plt$x$data[[6]]$legendgroup <- "change"
plt$x$data[[7]]$name <- plt$x$data[[7]]$legendgroup <-
  plt$x$data[[8]]$name <- plt$x$data[[8]]$legendgroup <- "Percentage"

plt

enter image description here

If you want the line to be behind all the points and text, then keep your existing plotting code as it was, and increment all the indices in your legend grouping code:

#customize legend
plt$x$data[[4]]$name <- plt$x$data[[4]]$legendgroup <-
  plt$x$data[[5]]$name <- plt$x$data[[5]]$legendgroup <- "rate"
plt$x$data[[6]]$name <- plt$x$data[[6]]$legendgroup <-
  plt$x$data[[7]]$name <- plt$x$data[[7]]$legendgroup <- "change"
plt$x$data[[8]]$name <- plt$x$data[[8]]$legendgroup <-
  plt$x$data[[9]]$name <- plt$x$data[[9]]$legendgroup <- "Percentage"
Allan Cameron
  • 147,086
  • 7
  • 49
  • 87
  • https://stackoverflow.com/questions/72870066/r-how-to-customize-sankey-plot-in-ggplotly –  Jul 05 '22 at 13:13
1

If you still wanted A at the bottom (although @Allan Cameron's answer looks great!), this will order by percentages and it will keep A at the bottom.

Keeping your plot and data as it is in your question, I started with building the plot.

plt2 <- plotly_build(plt)

To reorder the values by percentages, other than 'A', I reordered the data by Percentage added row numbers, and ordered it to match the order in the plot. I then used this to reorder the y-axis in the graph. I left my checks and balances in this code, so it may look like a lot, but a lot of it is the validation.

# determine the rearrangement
nOrder = df %>% 
  filter(origin != "A") %>% 
  arrange(Percentage) %>%   # desired order (other than A*)
  mutate(rn = 2:nrow(df)) %>%
  arrange(origin)
# add A as last
nOrder = rbind(c(unlist(df[df$origin == "A", ], use.names = F),
                 1), nOrder) %>% 
  mutate(across(c(Percentage, rn), as.integer))
# take a look
str(nOrder)

# create the vector with the order modifier
gimme <- unlist(nOrder$rn, use.names = F)

# expected order
(eo = append("A", df[df$origin != "A", ] %>% arrange(Percentage) %>% 
               select(origin) %>% unlist(use.names = F)))

# validgate gimme is set correctly
plt2$x$layout$yaxis$categoryarray
# check
plt2$x$layout$yaxis$categoryarray[order(gimme)]
all.equal(plt2$x$layout$yaxis$ticktext[order(gimme)], eo) # TRUE

Now it's time to reorder the y-axis. There are a few things that have to be changed due to the ggplot <-> plotly translation. range (or it will hide the A and add useless whitespace at the top) and the type need to be changed, along with the order. The order is set with the parameter categoryarray.

# finally change the plot's y-axis
plt2 %>% layout(yaxis = list(range = c(-.5, 10),
                             categoryarray = (1:10)[order(gimme)],
                             type = "category")) -> plt2
plt2

I noticed after I added the reorder, that the vertical lines were no longer visible. I created the lines again. (This is due to the changes in the y-axis.)

lines <- function(x = 0, dash = "solid", color = "black"){
  list(type = "line",
       x0 = x, x1 = x, y0 = 0, y1 = 1, 
       xref = "x", yref = "paper",
       layer = "below",
       line = list(color = color,
                   dash = dash))
}
plt2$x$layout$shapes <- list(plt2$x$layout$shapes,
                             lines(17), 
                             lines(dash = "longdash",
                                   color = "gray"))

Now for the legend names. There are a lot of ways to do this. I noticed that you had it right, added a layer, and then had a problem. Instead of picking through, you could use regex. Another option is to inspect to find the traces to change.

# change the applicable traces with conditions and regex
# fixes legend regardless of where or what order the traces fall in
invisible(
  lapply(1:length(plt2$x$data),
          function(j) {
            i = plt2$x$data[[j]]$name
            if(!is.null(i)){
              i = sub(".([[:alpha:]]+).*", "\\1", i)
              plt2$x$data[[j]]$name <<- 
                plt2$x$data[[j]]$legendgroup <<- i
            }
          })
  )

# if you just wanted to investigate
invisible(
  lapply(1:length(plt2$x$data),
         function(k) {
           message(k, ' ', plt2$x$data[[k]]$name) 
         }
))

Now you just have to call the plot.

plt2

enter image description here

Kat
  • 15,669
  • 3
  • 18
  • 51