1

I'd like to display the prediction lines of a model on a ggvis plot, so I can dynamically change the scale on the x-axis.

I can plot the model predictions in ggplot easily enough:

enter image description here

But when I try to do it in ggvis, I get strange behaviours - I don't know how to tell ggvis to group by "pop" in the predicted dataframe. These are the graphs that I'm getting... I'm wondering if this is even possible currently? Just read on http://ggvis.rstudio.com/layers.html that "You can not currently set the component of lines to different colours: track progress at https://github.com/trifacta/vega/issues/122." hmmmm...

Reproducible example below.

enter image description here

enter image description here

enter image description here

library(nlme)
library(dplyr)
library(ggplot2)
library(ggvis)


dframe <- structure(list(pop = structure(c(1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L,
1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 2L, 2L, 2L, 2L, 2L, 2L,
2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L), .Label =
c("P1", "P2"), class = "factor"), id = structure(c(1L, 2L, 1L, 3L, 4L, 5L, 6L,
7L, 8L, 9L, 2L, 10L, 11L, 11L, 12L, 5L, 13L, 2L, 14L, 10L, 15L, 5L, 16L, 16L,
17L, 18L, 19L, 20L, 21L, 23L, 24L, 25L, 22L, 24L, 23L, 25L, 22L, 16L, 20L,
11L, 3L, 2L, 1L, 1L), .Label = c("A", "B", "C", "D", "E", "F", "G", "H", "I",
"J", "K", "L", "M", "N", "O", "P", "Q", "R", "S", "T", "U", "V", "W", "X", "Y"
), class = "factor"), x = c(0.5, 1.5, 2.5, 3.5, 4.5, 5.5, 6.5, 7.5, 8.5, 9.5,
10.5, 0.5, 1.5, 2.5, 3.5, 4.5, 5.5, 6.5, 7.5, 8.5, 9.5, 10.5, 0.5, 1.5, 2.5,
3.5, 4.5, 5.5, 6.5, 7.5, 8.5, 9.5, 10.5, 0.5, 1.5, 2.5, 3.5, 4.5, 5.5, 6.5,
7.5, 8.5, 9.5, 10.5), act = c(13.9, 47.8, 68.3, 80.8, 88.4, 92.9, 95.7, 97.4,
98.4, 99, 99.4, 13.9, 47.8, 68.3, 80.8, 88.4, 92.9, 95.7, 97.4, 98.4, 99,
99.4, 12.7, 35.2, 48.9, 57.2, 62.2, 65.3, 67.1, 68.3, 69, 69.4, 69.6, 12.7,
35.2, 48.9, 57.2, 62.2, 65.3, 67.1, 68.3, 69, 69.4, 69.6), y = c(15L, 46L, 
68L, 80L, 92L, 89L, 95L, 97L, 99L, 96L, 103L, 14L, 43L, 72L, 81L, 88L, 94L,
93L, 98L, 96L, 100L, 102L, 12L, 36L, 50L, 54L, 62L, 66L, 68L, 65L, 71L, 69L,
68L, 14L, 37L, 51L, 56L, 63L, 66L, 69L, 65L, 70L, 69L, 73L)), .Names =
c("pop", "id", "x", "act", "y"), class = "data.frame", row.names = c(NA, -44L 
))

LVB = function(t, Linf, K, t0) 
{
  if (length(Linf) == 3) {
    K <- Linf[[2]]
    t0 <- Linf[[3]]
    Linf <- Linf[[1]]
  }
  Linf*(1-exp(-K*(t-t0)))
}

# Fit a null model with random effects (not interested in them right now)
model <- nlme(y~LVB(x,Linf, K, t0),data=dframe,
              fixed = list(Linf~pop, K~1, t0~pop),
              random = Linf ~1|id,
              start  = list(fixed= c(80, 0,
                                     0.5,
                                     -0.2, 0)))

# Create data frame of predicted values
predframe <- with(dframe, expand.grid(x = seq(0.5, 11, 0.1), y = seq(min(y), max(y), 20), pop = unique(pop)))
predframe$fitted <- predict(model, level = 0, newdata = predframe)

# Graph with ggplot 
g <- ggplot(dframe, aes(x, y, color = pop))
g + geom_point() + 
  geom_line(data =predframe, aes(x=x, y=fitted, color= pop))

# This is plotting the model bits properly
ggvis(dframe, ~x, ~y, fill = ~pop) %>%
  layer_points(size := 30) %>%
  layer_points(data = predframe, y =~fitted, fill =~pop, size := 1)

# This is the best I can get
ggvis(dframe, ~x, ~y, fill = ~pop) %>%
  layer_points() %>%
  layer_paths(data = predframe, y =~fitted, fill := NA, stroke =~pop)

# Results in squiggles
predframe <- predframe[order(predframe$fitted),]
ggvis(dframe, ~x, ~y, fill = ~pop) %>%
  layer_points() %>%
  layer_paths(data = predframe, y =~fitted, fill := NA, stroke =~pop)

# More squiggles.
predframe <- predframe[order(predframe$x),]
ggvis(dframe, ~x, ~y, fill = ~pop) %>%
  layer_points() %>%
  layer_paths(data = predframe, y =~fitted, fill := NA, stroke =~pop)

EDIT

Think I found a solution: Change the order of the arguments to ggvis:

    ggvis(predframe, ~x, ~fitted, stroke = ~pop) %>%
  layer_lines() %>%
  layer_points(data = dframe, x=~x, y=~y, fill = ~pop) %>%
  scale_numeric('x', domain = input_slider(0, 11, c(0, 11)), clamp = T)

enter image description here

Nova
  • 5,423
  • 2
  • 42
  • 62
  • 1
    You could just add your predicted values to your original dataset and then plot using `layer_lines`, i.e., `dframe$pred = predict(model, level = 0)`. `layer_lines` doesn't currently have a data argument - you can use `layer_paths` instead but then you'll have to order the dataset by 'x' manually. – aosmith Apr 01 '16 at 16:25
  • Both solutions aren't working ideally for me in my "real" situation. layer_lines because my data are chunky, so in between chunks the model lines look strangely flat, and layer_paths - not sure why that one is being so fractious. – Nova Apr 01 '16 at 19:49
  • 1
    I believe the dataset needs to be ordered by the x variable, not by the fitted values (the help page for `layer_lines` clarifies this). – aosmith Apr 01 '16 at 20:07
  • Both ways make ugly lines... – Nova Apr 01 '16 at 20:32
  • Consider updating your example dataset so that it reproduces the problem you are having. – aosmith Apr 02 '16 at 17:02
  • Thanks aosmith for staying with me on this one, I've updated all the code to provide a better reproducible example. I found out ggvis isn't reading my "population" column as a grouping variable, that's what's resulting in the ugly lines. – Nova Apr 03 '16 at 02:28
  • 1
    Glad you figured something out; you should add a solution as an answer. For the `layer_paths` option you could have grouped the prediction dataset - `layer_paths(data = group_by(predframe, pop), y =~fitted, stroke =~pop)` – aosmith Apr 04 '16 at 16:27
  • Sure, I'll do that. I'd like to add your solution too, but if I try `ggvis(dframe, ~x, ~y, fill = ~pop, stroke = ~pop) %>% layer_points() %>% layer_paths(data = group_by(predframe, pop), y =~fitted, stroke =~pop, fill := NA) %>% scale_numeric('x', domain = input_slider(0, 11, c(0, 11)), clamp = T)`, I get the right colors but a graph similar to my first ggvis plot above. If you can figure out how to not "connect" the lines I'll add your suggestion to my solution! – Nova Apr 04 '16 at 17:15
  • 1
    Maybe you forgot to set the order of `predframe` before plotting? Your comment code works for me (using a sorted `predframe`). – aosmith Apr 04 '16 at 17:29
  • That's it, yes! I had forgotten you'd suggested that originally. Horay! – Nova Apr 04 '16 at 17:51

1 Answers1

1

With @aosmith's help (thanks!), and some tweaking, we came up with two solutions to this problem, I'm posting both solutions here - to see the solution graphed, look at the "edits" section of my original question.

First solution (you don't have to sort the input data frame, but you DO have to put the arguments in this order to ggvis):

ggvis(predframe, ~x, ~fitted, stroke = ~pop) %>%
  layer_lines() %>%
  layer_points(data = dframe, x=~x, y=~y, fill = ~pop) %>%
  scale_numeric('x', domain = input_slider(0, 11, c(0, 11)), clamp = T)

Second solution (you have to sort the predicted values data.frame first):

predframe <- predframe %>%
  arrange(x)
ggvis(dframe, ~x, ~y, fill = ~pop, stroke = ~pop) %>%
  layer_points() %>%
  layer_paths(data = group_by(predframe, pop), y =~fitted, stroke =~pop, fill := NA) %>%
  scale_numeric('x', domain = input_slider(0, 11, c(0, 11)), clamp = T)
Nova
  • 5,423
  • 2
  • 42
  • 62