
I am trying to take a blank plot, feed it into onRender() from htmlwidgets, and add many lines inside the onRender() function. In the code below, I use a dataset with 100 rows (100 lines), and when I run the application, the 100 lines are drawn inside onRender() in about one second. However, when I change the dataset to have, say, 2000 lines, it takes about ten seconds to draw them all.

I am trying to achieve this for datasets on the order of 50,000 to 100,000 lines. At the current speed of the code, that is clearly not feasible!

The way I am currently achieving the functionality is by:

  1. Creating a data frame in R called pcpDat. It has 100 rows, an ID column, and 6 columns of numeric data.
  2. Creating a blank plot in R called p
  3. Feeding data frame pcpDat and plot p into onRender()
  4. In onRender(): I have an xArr object that just contains the values 0,1,2,3,4,5. For each row of the data frame, I reconstruct its 6 values into a numeric vector called yArr. Then, for each row of the data, I create a Plotly trace object that contains xArr and yArr to be plotted for the 6 x and 6 y values. This Plotly trace object then creates one orange line for each row of the original data frame.

It may seem silly to have so many lines plotted! My reasoning is I am trying to eventually add functionality so a user can use Plotly to select an area on the plot and view only the lines that intercept that area (the rest of the lines will be deleted). This is why I want the lines to be "interactive".

This all made me ponder a few questions:

  1. I am not experienced with JavaScript (which is the crux of the onRender() function). I am wondering if it is even possible to expect 50,000 to 100,000 lines to be plotted quickly (within say 5 seconds)?
  2. If the answer to (1) is that it should be possible, I am seeking advice on how I can "speed up" my code snippet below. Without much JavaScript experience, it is difficult for me to determine what is costing the most time. I could be reconstructing the data inefficiently.

I am eager to hear any advice or opinions on this topic. Thank you!

library(plotly)
library(ggplot2)
library(shiny)
library(htmlwidgets)
library(utils)

ui <- basicPage(

  plotlyOutput("plot1")
)

server <- function(input, output) {

  set.seed(3)
  f = function(){1.3*rnorm(100)}
  pcpDat = data.frame(ID = paste0("ID", 1:100), A=f(), B=f(), C=f(), D=f(), E=f(), F=f())
  pcpDat$ID = as.character(pcpDat$ID)

  colNms <- colnames(pcpDat[, c(2:(ncol(pcpDat)))])
  nVar <- length(colNms)

  p <- ggplot(mtcars, aes(x = wt, y = mpg)) + geom_point(alpha=0) + xlim(0,(nVar-1)) +ylim(min(pcpDat[,2:(nVar+1)]),max(pcpDat[,2:(nVar+1)])) + xlab("Sample") + ylab("Count")
  gp <- ggplotly(p)

  output$plot1 <- renderPlotly({
    gp %>% onRender("
      function(el, x, data) {

      var origPcpDat = data.pcpDat
      var pcpDat = data.pcpDat

      var Traces = [];
      var dLength = pcpDat.length
      var vLength = data.nVar
      var cNames = data.colNms

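      // x positions 0..vLength-1, shared by every line
      var xArr = [];
      for (var b=0; b<vLength; b++){
      xArr.push(b)
      }

      // one trace per row of the data frame
      for (var a=0; a<dLength; a++){
      var yArr = [];
      for (var b=0; b<vLength; b++){
      yArr.push(pcpDat[a][cNames[b]]);
      }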
      var pcpLine = {
      x: xArr,
      y: yArr,
      mode: 'lines',
      line: {
      color: 'orange',
      width: 1
      },
      opacity: 0.9,
      }
      Traces.push(pcpLine);
      }
      Plotly.addTraces(el.id, Traces);
}", data = list(pcpDat = pcpDat, nVar = nVar, colNms = colNms))})
}

shinyApp(ui, server)

EDIT: To demonstrate what I am trying to do, I am including 3 images. They show an example where there are 10 rows (lines) in the data. The first image is what the user would see at first (all 10 lines present). Then, the user can use the "Box select" tool to create a rectangle (gray). Any line that stays inside the rectangle for all x values the rectangle contains remains. In the second image for this example, 5 lines remain. After that, the user can, say, create another rectangle (gray). Again, any line that stays inside the rectangle for all x values it contains remains. In the third image for this example, only 1 of the lines remains. These 3 screenshots are from my functioning code, so I do have a prototype working. However, when I add thousands of lines, it is too slow.

[Image 1: all 10 lines present]

[Image 2: after the first box selection, 5 lines remain]

[Image 3: after the second box selection, 1 line remains]
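
The filtering rule described in the EDIT can be sketched as a plain JavaScript function. This is only a minimal illustration, not the prototype's actual code: the name filterRowsByBox and the shape of the box object (xMin, xMax, yMin, yMax) are hypothetical, and it assumes rows look like the pcpDat records passed to onRender(), with columns plotted at x = 0, 1, 2, and so on. How the box bounds are obtained from the selection is not shown.

```javascript
// Hypothetical helper (not from the original post): keep only the rows
// whose line stays inside the box at every x position the box covers.
// rows:     array of objects, e.g. {ID: 'ID1', A: 0.1, B: -0.4, ...}
// colNames: column names in plotting order; column i is drawn at x = i
// box:      {xMin, xMax, yMin, yMax} in data coordinates (assumed shape)
function filterRowsByBox(rows, colNames, box) {
  // Integer x positions (column indices) that fall inside the box.
  var first = Math.max(0, Math.ceil(box.xMin));
  var last = Math.min(colNames.length - 1, Math.floor(box.xMax));

  return rows.filter(function (row) {
    for (var i = first; i <= last; i++) {
      var y = row[colNames[i]];
      if (y < box.yMin || y > box.yMax) {
        return false; // the line leaves the box at this x position
      }
    }
    // If the box covers no x position at all, every row is kept.
    return true;
  });
}

// Usage: the box covers only x = 1 (column B), so ID2 is dropped.
var exampleRows = [
  {ID: 'ID1', A: 0, B: 1, C: 2},
  {ID: 'ID2', A: 0, B: 5, C: 2}
];
var kept = filterRowsByBox(exampleRows, ['A', 'B', 'C'],
                           {xMin: 0.5, xMax: 1.5, yMin: 0, yMax: 2});
console.log(kept.map(function (r) { return r.ID; }));
```

Because the function works on the raw rows rather than on plotted traces, its cost is one pass over the data per selection, independent of how the lines are drawn.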

  • Can you move to creating your plot in native `plotly`? Seems you are doing unnecessary acrobatics via `ggplot` and `onRender` to do what could be done using `plotly::plot_ly` – Kevin Arseneau Oct 08 '17 at 23:50
  • Thanks @KevinArseneau. In your opinion, could this type of interactive graphic be accomplished with plotly::plot_ly (50,000 to 100,000 lines plotted in under 5 seconds)? I am creating this software for people using R, so the data fed into the interactive graphic (called pcpDat in the MWE) would be an R data frame. That is why I have been using onRender() and ggplot(): they let me connect R and plotly. I agree it does seem like acrobatics as is. –  Oct 09 '17 at 01:10
  • at some point you will be hardware constrained so it is impossible to answer. However, if you intend on using `plotly` to display the plot, there will not be a faster way than using the native API. – Kevin Arseneau Oct 09 '17 at 01:20
  • Thank you, I will keep that in mind. However, I have to wonder if it would be possible to read R objects into native plotly. My intention is to allow R users to call a function from R that reads in their R data frames and produces the interactive visualization. I am not sure if that would be possible with native plotly. I would also be open to learning other interactive graphical tools that can read in R objects. –  Oct 09 '17 at 01:38
  • that is exactly what the `plotly` package does. Just don't use `ggplotly` to convert a `ggplot` object. See https://plot.ly/r/ – Kevin Arseneau Oct 09 '17 at 01:41

1 Answer


If you translate your ggplot and plotly JavaScript code to the native API of the plotly R package, you will remove the extra steps and computation you currently have. Minimal example solution below:

output$plot1 <- renderPlotly({

  plot_ly(type = "scatter", mode = "markers") %>%
    add_trace(
      x = ~wt,
      y = ~mpg,
      data = mtcars
    ) %>%
    layout(
      xaxis = list(title = "Sample"),
      yaxis = list(title = "Count")
    )

})


To accomplish the hidden traces, you can set the visible = "legendonly" attribute on your traces, and the user can switch those on or off. See these answers for more detail, 1 & 2

You can also use inputs and reactives to limit the amount of data you send to plotly instead of giving it everything each time you want to generate.
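
A different way to reduce the work Plotly has to do, offered here only as a sketch and not something the answer above tested: instead of one trace per row, pack every row into a single scatter trace and separate the rows with null values. Scatter traces leave a gap at null unless connectgaps is true, so each row still draws as its own line. The function name buildSingleTrace is hypothetical; rows and colNames are assumed to match the pcpDat and colNms values passed to onRender() in the question.

```javascript
// Hypothetical helper (an assumption, not part of the original code):
// build ONE Plotly trace holding all rows, with null separators so that
// consecutive rows are not joined by a connecting segment.
function buildSingleTrace(rows, colNames) {
  var x = [];
  var y = [];
  for (var a = 0; a < rows.length; a++) {
    for (var b = 0; b < colNames.length; b++) {
      x.push(b);                      // column index is the x position
      y.push(rows[a][colNames[b]]);   // value of that column for this row
    }
    // A null point ends the current line before the next row starts.
    x.push(null);
    y.push(null);
  }
  return {
    x: x,
    y: y,
    type: 'scatter',
    mode: 'lines',
    line: {color: 'orange', width: 1},
    opacity: 0.9,
    connectgaps: false,   // keep the gaps at the null separators
    showlegend: false
  };
}

// Usage with two rows and two columns.
var trace = buildSingleTrace([{A: 1, B: 2}, {A: 3, B: 4}], ['A', 'B']);
console.log(JSON.stringify(trace.x));
console.log(JSON.stringify(trace.y));

// Inside onRender() this would replace the per-row loop, for example:
//   Plotly.addTraces(el.id, [buildSingleTrace(data.pcpDat, data.colNms)]);
```

The trade-off is that the lines are no longer separate traces, so removing individual lines means rebuilding the x and y arrays from the remaining rows and redrawing the one trace. Whether this reaches the 50,000 to 100,000 line target within a few seconds depends on the browser and hardware and is not claimed here.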

Kevin Arseneau
  • Thanks. I think I may have tried this approach before, but I needed to add a trace for each line - it would make the legend too long if each legend label corresponded to one line. I am interested in the user being able to "select an area on the plot and view only the lines that intercept that area". So, it is not so much just turning on and off the legend components. I am aiming to achieve a functionality where they really interact with the plot area itself. I added an EDIT to this post to show 3 example images of what I am trying to accomplish. Thank you again. –  Oct 09 '17 at 03:31
  • @luckButtered, you can turn off the legend completely and still use the approach in my solution – Kevin Arseneau Oct 09 '17 at 03:33