I'm trying to plot a network (a protein-protein interaction graph stored as an R igraph object) that is moderately large (111 nodes and 929 edges), and I'm looking for a reasonable way to label the nodes. The labels are either the protein IDs or the names of their encoding genes, and in either case they are several characters long.

Here's the code to obtain the protein-protein interaction 'igraph' object:

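library(STRINGdb)
library(dplyr)      # provides %>% used below
library(ggplot2)    # provides ggplot() and aes()
library(ggnetwork)  # lets ggplot() accept an igraph object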
string.db <- STRINGdb$new(version="11.5",species=9606,score_threshold=200,input_directory="")
string.graph <- string.db$get_graph()
igraph::V(string.graph)$name <- gsub("^9606\\.","",igraph::V(string.graph)$name)

string.graph is the entire graph and below I subset it to keep only the nodes connected to a pair of proteins of interest:

protein1 <- "ENSP00000475261"
protein2 <- "ENSP00000360829"
neighbors1 <- igraph::neighbors(graph = string.graph, v = protein1, mode = "all")
neighbors1 <- neighbors1[which(!is.na(names(neighbors1)))]
neighbors2 <- igraph::neighbors(graph = string.graph, v = protein2, mode = "all")
neighbors2 <- neighbors2[which(!is.na(names(neighbors2)))]
all.neighbors <- unique(c(neighbors1, neighbors2))

string.subgraph.vs <- igraph::V(string.graph)[name %in% c(names(all.neighbors), protein1, protein2)]
string.subgraph.vs.list <- igraph::ego(string.graph, order = 1, nodes = string.subgraph.vs, mode = "all", mindist = 0)
string.subgraph <- igraph::induced_subgraph(string.graph, unlist(string.subgraph.vs.list))

nodes.df <- data.frame(id = names(igraph::V(string.subgraph))) %>%
  unique() %>% dplyr::filter(id %in% c(names(all.neighbors), protein1, protein2)) %>%
  dplyr::mutate(size = 1, color = NA)
nodes.df$color[which(nodes.df$id %in% names(neighbors1) & !(nodes.df$id %in% names(neighbors2)))] <- "cornflowerblue"
nodes.df$color[which(nodes.df$id %in% names(neighbors2) & !(nodes.df$id %in% names(neighbors1)))] <- "#ee2400"
nodes.df$color[which(nodes.df$id %in% names(neighbors1) & nodes.df$id %in% names(neighbors2))] <- "darkorchid1"
nodes.df$color[which(nodes.df$id == protein1)] <- "cornflowerblue"
nodes.df$color[which(nodes.df$id == protein2)] <- "#ee2400"
nodes.df$size[which(nodes.df$id == protein1)] <- 3
nodes.df$size[which(nodes.df$id == protein2)] <- 3

edges.df <- as.data.frame(igraph::as_edgelist(string.subgraph)) %>%  # as_edgelist() replaces the deprecated get.edgelist()
  dplyr::rename(from = V1, to = V2) %>% unique() %>%
  dplyr::filter(from %in% c(names(all.neighbors), protein1, protein2) & to %in% c(names(all.neighbors), protein1, protein2))

new.string.subgraph <- igraph::graph_from_data_frame(d = edges.df, directed = FALSE, vertices = nodes.df)

It's a bit cumbersome, but I followed this blog for subsetting an igraph object.
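For comparison, the same subsetting idea can probably be written more compactly. This is only a sketch on a made-up toy graph (the vertices A to G and the names toy.graph, p1, p2, nb1 and nb2 are hypothetical, not STRINGdb data): it keeps the two proteins of interest plus their direct neighbours with igraph::induced_subgraph() and stores colour and size as vertex attributes, which should make the intermediate data frames unnecessary.

```r
library(igraph)

# Toy undirected graph standing in for string.graph (hypothetical data)
toy.graph <- graph_from_literal(A - B, A - C, B - C, D - C, D - E, F - G)

p1 <- "A"  # stands in for protein1
p2 <- "D"  # stands in for protein2

# Names of the direct neighbours of each protein of interest
nb1 <- neighbors(toy.graph, p1, mode = "all")$name
nb2 <- neighbors(toy.graph, p2, mode = "all")$name

# Keep both proteins and the union of their neighbours,
# together with every edge that runs between the kept vertices
keep <- unique(c(p1, p2, nb1, nb2))
sub.graph <- induced_subgraph(toy.graph, vids = keep)

# Colour by neighbourhood: shared neighbours get their own colour,
# and the two proteins of interest are overridden afterwards
v.names <- V(sub.graph)$name
in1 <- v.names %in% nb1
in2 <- v.names %in% nb2
node.color <- ifelse(in1 & in2, "darkorchid1",
                     ifelse(in1, "cornflowerblue", "#ee2400"))
node.color[v.names == p1] <- "cornflowerblue"
node.color[v.names == p2] <- "#ee2400"

V(sub.graph)$color <- node.color
V(sub.graph)$size <- ifelse(v.names %in% c(p1, p2), 3, 1)
```

In the toy graph, F and G are dropped because neither is adjacent to p1 or p2.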

If I plot it with R's ggplot2 and ggnetwork packages and add the node labels:

set.seed(1)  # seed the RNG before the layout is computed
ggplot(new.string.subgraph, aes(x = x, y = y, xend = xend, yend = yend)) +
  ggnetwork::geom_edges(color = "grey50", alpha = 0.1) +
  ggnetwork::geom_nodes(color = igraph::vertex_attr(new.string.subgraph)$color, size = igraph::vertex_attr(new.string.subgraph)$size) +
  ggnetwork::geom_nodetext(aes(label = name)) +
  ggnetwork::theme_blank()

The outcome is quite crowded with node labels.

Compare that with how it looks without the node labels:

set.seed(1)  # same seed, so the layout should match the labeled plot
ggplot(new.string.subgraph, aes(x = x, y = y, xend = xend, yend = yend)) +
  ggnetwork::geom_edges(color = "grey50", alpha = 0.1) +
  ggnetwork::geom_nodes(color = igraph::vertex_attr(new.string.subgraph)$color, size = igraph::vertex_attr(new.string.subgraph)$size) +
  ggnetwork::theme_blank()

BTW, I'm hoping that calling set.seed(1) before plotting does guarantee that the plot is reproducible.
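One way to check this is to compute the layout twice from the same seed and compare the coordinates. The sketch below does that on a hypothetical ring graph with igraph::layout_with_fr(), which is an assumption on my part: the layout ggnetwork actually picks for this graph may be a different one. The point is only that a randomised layout repeats when the seed is set immediately before the layout is computed.

```r
library(igraph)

# Hypothetical stand-in for new.string.subgraph
toy.graph <- make_ring(10)

# Compute the same force-directed layout twice, reseeding each time
set.seed(1)
layout.a <- layout_with_fr(toy.graph)

set.seed(1)
layout.b <- layout_with_fr(toy.graph)

# TRUE when both runs produced the same coordinates
same.layout <- isTRUE(all.equal(layout.a, layout.b))
```

Computing the coordinates once and reusing the resulting matrix for every plot avoids relying on the seed at all. As far as I know, ggnetwork accepts such a matrix through its layout argument, but I have not verified that here.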

So I thought I'd use plotly (plotly::ggplotly()) to show the node labels on hover instead. However, the hover text only shows the x and y coordinates, and if I pass "label" to the tooltip parameter of plotly::ggplotly() I don't get any hover information at all.

Any idea how to get plotly to display the node labels without actually seeing them in the plot?

1 Answer

I wasn't able to load the STRINGdb package under the latest version of R (4.3.0), so I couldn't run your code. However, have you tried modifying your plotly object with plotly::style()? I'm assuming it would look something like the following:

p <- ggplot(new.string.subgraph, aes(x = x, y = y, xend = xend, yend = yend)) +
  ggnetwork::geom_edges(color = "grey50", alpha = 0.1) +
  ggnetwork::geom_nodes(color = igraph::vertex_attr(new.string.subgraph)$color, size = igraph::vertex_attr(new.string.subgraph)$size) +
  ggnetwork::theme_blank()

# style() modifies the traces of a plotly object, so convert the ggplot first
library(plotly)
p <- ggplotly(p) %>% style(hoverinfo = "name")

p

You may have to play around with the style arguments. See this plotly page for more details.
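If style() doesn't get you there, an alternative that sidesteps ggnetwork is to build the node and edge coordinates yourself and map the node name to a text aesthetic, which ggplotly() can then show through tooltip = "text". This is a different technique from the style() call above, sketched on a hypothetical five-node ring with made-up names P1 to P5; nodes.xy, edges.xy and p.hover are illustrative names only.

```r
library(igraph)
library(ggplot2)
library(plotly)

# Hypothetical stand-in for new.string.subgraph
toy.graph <- make_ring(5)
V(toy.graph)$name <- paste0("P", 1:5)

# One layout, computed once, so nodes and edges share coordinates
set.seed(1)
xy <- layout_with_fr(toy.graph)

# One row per node, in vertex order, so each name stays with its point
nodes.xy <- data.frame(name = V(toy.graph)$name, x = xy[, 1], y = xy[, 2])

# One row per edge, looked up by vertex index
el <- as_edgelist(toy.graph, names = FALSE)
edges.xy <- data.frame(
  x = xy[el[, 1], 1], y = xy[el[, 1], 2],
  xend = xy[el[, 2], 1], yend = xy[el[, 2], 2]
)

# ggplot2 warns that `text` is an unknown aesthetic; ggplotly() still uses it
p <- ggplot() +
  geom_segment(data = edges.xy,
               aes(x = x, y = y, xend = xend, yend = yend),
               color = "grey50", alpha = 0.3) +
  geom_point(data = nodes.xy, aes(x = x, y = y, text = name),
             color = "cornflowerblue", size = 3) +
  theme_void()

# Show only the `text` aesthetic (the node name) on hover
p.hover <- ggplotly(p, tooltip = "text")
```

Because each row of nodes.xy holds one node and its own name, there is no separate label vector that can fall out of step with the plotted points.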

  • Nice. Adding `label = name` to the aes of the `ggplot` call and then using `style(hoverinfo = "label")` does add the node labels as hover info. However, it messes up the label order, meaning that hovering over a node shows another node's name. Any idea how to get that part right? I'm assuming it's a `ggnetwork` thing. – user1701545 May 11 '23 at 03:55
  • Hmmm, not sure what exactly is causing that. You can use `igraph::set_vertex_attr()` to add a `"label"` attribute early on in your code (and then modify the ggplot calls). [This](https://plotly.com/r/reference/#text) plotly page may be of some help, from `text` down to `yhoverformat`. These are options which you should be able to modify and use in your `style()` call. – 542goweast May 11 '23 at 18:34
  • Separately, [this](tinyurl.com/jd2x7v24) page describes a way to use `ggnetwork::geom_nodetext_repel` to get non-overlapping labels, but I doubt this will work for your graph. It also mentions using `geom_nodelabel` in place of `geom_nodetext`; I haven't looked at how the back-end calls differ between the two functions. Finally, there is a chance that your ordering problem could be resolved by some `arrange` or `group_by` calls on the data itself, but I think you'd be better off trying some other things first. – 542goweast May 11 '23 at 18:36
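A minimal sketch of the `set_vertex_attr()` suggestion from the comments, assuming a hypothetical lookup from protein IDs to gene names (the IDs, gene names and `gene.lookup` below are made up). Indexing the lookup by vertex name, rather than relying on its order, is what keeps each label attached to the right vertex.

```r
library(igraph)

# Hypothetical stand-in for new.string.subgraph
toy.graph <- make_ring(4)
V(toy.graph)$name <- c("ENSP1", "ENSP2", "ENSP3", "ENSP4")

# Hypothetical ID-to-gene lookup, deliberately in a different order
gene.lookup <- c(ENSP3 = "GENE_C", ENSP1 = "GENE_A",
                 ENSP4 = "GENE_D", ENSP2 = "GENE_B")

# Index the lookup by vertex name so labels follow the vertex order
toy.graph <- set_vertex_attr(
  toy.graph,
  name = "label",
  value = unname(gene.lookup[V(toy.graph)$name])
)

# Vertex attributes as a data frame, one row per vertex in vertex order
nodes.df <- igraph::as_data_frame(toy.graph, what = "vertices")
```

The resulting `label` column could then be mapped in the plot in place of `name`; whether that alone fixes the hover ordering under `ggnetwork` is not something this sketch establishes.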