I am trying to make a correlation plot with plot_ly.

As an example, I run:

library(plotly)

d <- diamonds[sample(nrow(diamonds), 1000), ]

p <- plot_ly(
  d, x = ~carat, y = ~price,
  # Hover text:
  text = ~paste("Price: ", price, '$<br>Cut:', cut),
  color = ~carat, size = ~carat
)

Then I plot p.

How can I add a linear regression line and calculate the R2? Is there a way to do it?

If you know any other way to do it, please let me know.

For example, something like this would be great

http://vault.hanover.edu/~altermattw/courses/220/R/corr/corr_2.html

I tried to do it with:

library(ggplot2)
ggplot(d, aes(x=carat, y=price)) +
        geom_point(aes(colour = Outcome)) +     
        geom_smooth(method=lm) 

but I am getting an error.

MLavoie
nik

1 Answer

You could try this:

# Regression through the origin: the -1 drops the intercept
fit <- lm(price ~ carat - 1, data = d)
summary(fit)$adj.r.squared

# Annotation for the plot; x and y set where the label appears (data coordinates)
a <- list(
    x = 2,
    y = 5000,
    text = "R2 = 0.88",  # hard-coded label text
    xref = "x",
    yref = "y",
    showarrow = FALSE,
    arrowhead = 7
)

plot_ly() %>%
    add_markers(data = d, x = ~carat, y = ~price, color = ~carat, size = ~carat,
                name = "Size", marker = list(colorbar = list(title = 'Colorbar'))) %>%
    add_lines(x = ~carat, y = fitted(fit), name = "Regression line") %>%
    layout(annotations = a)
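
Because d is a random sample, the R2 changes from run to run, so a label typed in by hand can go stale. A minimal sketch of one way around that is to build the annotation text from the fitted model. The set.seed() call, the fitted column added to d, and the rounding to two decimals are my own assumptions, not part of the original code:

```r
library(plotly)  # attaches ggplot2, which provides the diamonds data

set.seed(1)  # assumption: fixed seed so the sample is reproducible
d <- diamonds[sample(nrow(diamonds), 1000), ]

# Regression through the origin (no intercept)
fit <- lm(price ~ carat - 1, data = d)
r2 <- summary(fit)$adj.r.squared

# Store the fitted values as a column so the line trace can refer to it
d$fitted <- as.numeric(fitted(fit))

# Build the label from the model instead of typing the number in
a <- list(
  x = 2, y = 5000,
  text = paste0("R2 = ", round(r2, 2)),
  xref = "x", yref = "y",
  showarrow = FALSE
)

p <- plot_ly() %>%
  add_markers(data = d, x = ~carat, y = ~price, color = ~carat, size = ~carat,
              name = "Size", marker = list(colorbar = list(title = "Colorbar"))) %>%
  add_lines(data = d, x = ~carat, y = ~fitted, name = "Regression line") %>%
  layout(annotations = a)
p
```

One caveat: for a model without an intercept, summary() measures R2 against zero rather than against the mean of price, so the value is not directly comparable with the R2 of a model that includes an intercept.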

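
As for the ggplot2 attempt in the question, the error most likely comes from colour = Outcome, since diamonds has no column with that name. A rough sketch, assuming you are happy to colour the points by cut instead (that substitution, the seed, and se = FALSE are my assumptions), with the smoother forced through the origin:

```r
library(ggplot2)

set.seed(1)  # assumption: fixed seed so the sample is reproducible
d <- diamonds[sample(nrow(diamonds), 1000), ]

# `cut` stands in for `Outcome`, which is not a column of diamonds
g <- ggplot(d, aes(x = carat, y = price)) +
  geom_point(aes(colour = cut)) +
  # formula = y ~ x - 1 fits the line without an intercept
  geom_smooth(method = "lm", formula = y ~ x - 1, se = FALSE)
g
```

If you want a plotly object from this, ggplotly(g) from the plotly package converts it.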

MLavoie
  • Warning messages: 1: Numeric color variables cannot (yet) be mapped to lines. when the trace type is 'scatter' or 'scattergl'. 2: plotly.js doesn't yet support line.width arrays, track this issue for progress – nik Feb 03 '18 at 17:14
  • I see your point. Is there any way to discard the extrapolation, i.e. the extreme values? I will accept your answer either way. – nik Feb 03 '18 at 17:21
  • See edits. You can create a new data set and subset for values above 0. – MLavoie Feb 03 '18 at 17:26
  • No, look at your line: it has an intercept. I want the line to pass through 0, with no intercept, and then calculate the correlation. Do you know what I mean? – nik Feb 03 '18 at 17:29
  • I liked it and accepted it. However, there are two problems with it. Add the ) after the fit. Also, can you tell me how to put the R2 in a specific place? And is it possible to change the legend name? – nik Feb 03 '18 at 17:46
  • See edits. As for the R2, please look at **a**: you define the location with x and y. – MLavoie Feb 03 '18 at 17:56
  • Do you know how to remove Size and Regression line from the legend? – nik Feb 03 '18 at 20:10
  • That turns off the whole legend; I only want to drop Size and Regression line. – nik Feb 03 '18 at 20:18
  • This works for me: plot_ly() %>% add_markers(data = d, x= ~ carat, y = ~ price, color = ~carat, size = ~carat, name = "Size", marker=list(colorbar=list(title='Colorbar'))) %>% add_lines(x = ~carat, y = fitted(fit), name = "Regression line", showlegend=FALSE) %>% layout(annotations = a) – MLavoie Feb 03 '18 at 20:20