I am plotting two different types of data in R on the same x axis with the plotly library (plot_ly). I have two independent y axes. The data here is a quick mockup representing the dataset I am working with: a family of explanatory variables (i.e. actuals 1-10), their coefficients, the normalized number of times each appeared in a fictitious model, and NA values to pad the vectors so that they fit in a data frame:

require(tidyverse)
require(data.table)
require(stringr)
require(plotly)
require(processx)

Actuals <-c(1,2,3,4,5,6,7,8,9,10)       
set.seed(10)
UniqCoef1 <- runif(10, min=-1, max=1)    
NrmPolct1 <- c(.015,.005,.33,.32,.225,.025,.03,.05,NA ,NA)
dataframe1 <- data.frame(NrmPolct1,UniqCoef1,Actuals)

In an effort to fix the ranges of the y and y2 axes, I built a scale factor from the max bar value (.33, the largest entry in NrmPolct1):

sclfctr1    <- ((.33*1.02)/.20)

Then plotted the data:

plot1 <- plot_ly(dataframe1) %>%
            add_trace(x=~Actuals, y=~UniqCoef1, type="scatter", mode="markers", name="Coef Values") %>%
            add_bars(x=~Actuals, y=~NrmPolct1, yaxis="y2", name="% Polcy Count") %>%
            layout(plot_bgcolor='#D0CFC9',
                    yaxis=list(scaleanchor="x", scaleratio=5,title="Coefficients", zeroline = FALSE,
                                range=c((((max(UniqCoef1)*1.02)-(max(UniqCoef1)*1.02)+(min(UniqCoef1)*.98))/.80),max(UniqCoef1)*1.02), 
                                tickvals=UniqCoef1,standoff=30),
                    # note: "scaleacnchor" is a misspelling of scaleanchor, so plotly ignores this setting
                    yaxis2=list(scaleacnchor="y", scaleratio=.15,overlaying="y",side="right", tickvals=NrmPolct1,range=c(0,sclfctr1),title="% Polcy Count", standoff=30),
                    xaxis=list(tickvals=Actuals,title=paste("actuals"),showgrid=T, standoff=30),
                    title=paste0("# Actuals for State 23, Crop 41."))

plot1

The method I am using now works generally to keep the scatter plot from being occluded by the bars, but I am hoping to find a way to constrain y1 so it utilizes only the top 80% or so of the plot area and y2 the bottom 20% or so.

Is this possible with plot_ly in R? So far, neither scaleanchor and scaleratio nor the range arguments in layout() have been effective.

Update: I messed around with the base argument in the add_bars() section with the same results.

Update: no luck with domain either. It is possible I am not using the tools in the library correctly but I am not sure what else to try.

Update 12/2/21: Testing out new range calculations. Might have found a jerry-rigged approach.

PDiddyA
  • Have you looked at `ggplotly`? There might be some options there. – neuron Dec 02 '21 at 03:00
  • The main issue with plotly is that it heavily lacks tunability, especially with things like this. – neuron Dec 02 '21 at 03:02
  • @neuron, I originally tried this with ggplot, and plot_ly ended up being much easier, so I did not consider ggplotly. I will give this a look and see what I can find. – PDiddyA Dec 02 '21 at 03:31
  • Well, in ggplot, when working with a secondary axis, you actually have to scale the data, not the axis. Maybe the same approach? – Alberson Miranda Dec 02 '21 at 04:03
  • @AlbersonMiranda I need the data to retain its original values. IIRC, scaling the data changes those, right? – PDiddyA Dec 02 '21 at 04:19
  • @JDA95 Doesn't seem like `ggplotly` will work. I tried making a plot with `ggplot` and used `ggplotly`, and the second axis was removed. – neuron Dec 02 '21 at 06:02
  • @neuron thanks for the heads up. I hadn't had a chance to start looking, so that saves me quite a bit of time. I might just have to run with things the way they are and rely on the interactive plots to get around the occlusion. – PDiddyA Dec 02 '21 at 14:49

1 Answer

In short, the answer so far is no. Thanks to @neuron for their work going through ggplotly and ruling that approach out.

However, with better range calculations and with the NA values replaced by 0, the range=c() argument in layout() can keep the bars from occluding any of the scatter points. I tested this on the data set above as well as on my actual data, and it worked in all 696 cases.
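The NA replacement matters because max() returns NA when its input contains NA, which would break the scale factor. A minimal base R sketch of that step, using the bar vector from the question:

```r
# Bar data from the question, padded with NA
NrmPolct1 <- c(.015, .005, .33, .32, .225, .025, .03, .05, NA, NA)

# Replace the NA padding with 0 so max() returns a number
NrmPolct1[is.na(NrmPolct1)] <- 0

max(NrmPolct1)
```

The data frame would need to be rebuilt from the cleaned vector afterwards so that the plot uses the same values.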

sclfctr1    <- ((max(NrmPolct1, na.rm = TRUE)*1.02)/.18)

It is important to reference the vector containing the bar data here for proper scaling.

Then, create variables for the y-axis data range, the y-axis total range, the axis padding (this will need to be tweaked for other data sets), and the maximum and minimum values for the axis:

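# span of the scatter data on the left axis
leftAxisDataRange <- max(UniqCoef1) - min(UniqCoef1)
# stretch the axis so the data fills roughly the top 78% of it
leftAxisTotalRange <- leftAxisDataRange / 0.78
# small margin (1% of the total range); tweak for other data sets
leftAxisPadding <- leftAxisTotalRange * 0.01
leftAxisMax <- max(UniqCoef1) + leftAxisPadding
leftAxisMin <- (leftAxisMax - leftAxisTotalRange) - leftAxisPadding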
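To see why these constants keep the two layers apart, it can help to convert both ranges into fractions of the plot height. The sketch below is illustrative only: the helper axisFraction() and the sample vectors coefs and bars are hypothetical names, not part of the code above. It assumes both axes share the same plot area (as they do with overlaying="y") and it ignores the pixel size of the markers.

```r
# Hypothetical sample data (not the real data set)
coefs <- c(-0.9, -0.2, 0.4, 0.8)
bars  <- c(0.05, 0.33, 0.10, 0)

# Same calculations as in the answer, with shorter names
sclfctr    <- (max(bars) * 1.02) / 0.18
dataRange  <- max(coefs) - min(coefs)
totalRange <- dataRange / 0.78
padding    <- totalRange * 0.01
axisMax    <- max(coefs) + padding
axisMin    <- (axisMax - totalRange) - padding

# Hypothetical helper: position of v on an axis [lo, hi], 0 = bottom, 1 = top
axisFraction <- function(v, lo, hi) (v - lo) / (hi - lo)

lowestPoint <- axisFraction(min(coefs), axisMin, axisMax)  # 0.22 / 1.01, about 0.218
tallestBar  <- axisFraction(max(bars), 0, sclfctr)         # 0.18 / 1.02, about 0.176

# TRUE means the lowest marker centre sits above the top of the tallest bar
lowestPoint > tallestBar
```

Both fractions depend only on the constants 0.78, 0.01, 0.18 and 1.02, not on the data (as long as the bar maximum is positive and the coefficients are not all equal), which is consistent with the approach holding up across many data sets.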

Finally, replace the range calculation from above with the more readable variables just created:

plot_ly(dataframe1) %>%
            add_trace(x=~Actuals, y=~UniqCoef1, yaxis="y", type="scatter", mode="markers", name="Coef Values") %>%
            add_bars(x=~Actuals, y=~NrmPolct1, yaxis="y2", name="% Polcy Count") %>%
            layout(plot_bgcolor='#D0CFC9',
                    yaxis=list(side="left",scaleanchor="x", scaleratio=sclfctr1,title="Coefficients", zeroline = FALSE,
                                tickvals=UniqCoef1,standoff=30,
                                range=c(leftAxisMin,leftAxisMax)),
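                    # note: "scaleacnchor" is a misspelling of scaleanchor, so plotly ignores this setting; the explicit range positions the bars
                    yaxis2=list(scaleacnchor="y", scaleratio=.15,overlaying="y",side="right", tickvals=NrmPolct1,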
                                range=c(0,sclfctr1),title="% Polcy Count", standoff=30),
                    xaxis=list(tickvals=Actuals,title=paste("actuals"),showgrid=T, standoff=30),
                    title=paste0("# Test Actuals for State 23, Crop 41."))

With UniqCoef1 <- runif(10, min=-2, max=2) and NrmPolct1 <- c(.015,.005,.66,.32,.225,.025,.03,.05,0,0) (chosen so that the bars would occlude the points if the range were incorrect), we get:

[image: resulting plot]

PDiddyA