
I want to plot some financial data with very wide ranges. At first I used linear axes, but because of the extreme ranges on both the x and y axes, the plot ends up unusable. I know there are outliers, but I don't want to exclude them from the chart.

Linear chart

Hence I'm using a log scale for both the x and y axes. The log-scale plot was created successfully, but it shows only the positive data; all the negative data is gone from the plot. After a bit of searching, I found a Bokeh GitHub issue about forcing log axes to remain positive: https://github.com/bokeh/bokeh/issues/5550

Log chart

Given the info from GitHub, is it really impossible to create a log scale that includes negative values? What I want is for the chart to extend both the x and y axes to negative values and show the full data (so there is no need to exclude the outliers).

Here is the code I have written:

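from bokeh.models import NumeralTickFormatter
from bokeh.plotting import figure, show

# TOOLS, TOOLTIPS, and `fundamental` are assumed to be defined earlier.
p = figure(
    x_axis_type="log",
    y_axis_type="log",
    height=600,
    sizing_mode="stretch_width",
    tools=TOOLS,
    tooltips=TOOLTIPS,
    toolbar_location="above",
)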

p.xaxis.major_label_orientation = 3.14 / 4
p.xaxis[0].formatter = NumeralTickFormatter(format="0")
p.yaxis[0].formatter = NumeralTickFormatter(format="0")
p.grid.grid_line_alpha = 0.3

low_x = fundamental['Profit [%]'].min()
high_x = fundamental['Profit [%]'].max()
low_y = fundamental['Profit Growth [%]'].min()
high_y = fundamental['Profit Growth [%]'].max()

p.line(x = (low_x, 0), y = (0, 0), color = 'red', line_width = 2, line_dash = 'dashed')
p.line(x = (0, 0), y = (low_y, 0), color = 'red', line_width = 2, line_dash = 'dashed')
p.line(x = (0, high_x), y = (0, 0), color = 'green', line_width = 2, line_dash = 'dashed')
p.line(x = (0, 0), y = (0, high_y), color = 'green', line_width = 2, line_dash = 'dashed')

show(p)

The data can be any random floats, as long as they include negative values. I'm using Python 3.8.13 and Bokeh 2.4.3 on a Windows 10 machine. Cheers!

  • I think you are looking for an equivalent of the [broken axis](https://matplotlib.org/stable/gallery/subplots_axes_and_figures/broken_axis.html). As far as I know, this is not implemented in Bokeh. But the log axis will not work for your example, because the log never reaches zero. Check the [log example](http://docs.bokeh.org/en/latest/docs/gallery/logplot.html) to see my point. – mosc9575 Aug 25 '22 at 12:55
  • Yes, I understand the point about the log never reaching zero. My thinking is that if the axis could show negative numbers, outliers like the ones in my example could be included in the graph without making the chart unusable (please compare the linear chart with the log chart). – Rakshasha Medhi Aug 28 '22 at 01:37
  • And these outliers shouldn't need to be excluded from the chart. They can be useful, for example to show which companies are having a major positive turnaround in their financials. If last year's net profit was merely $100 and this year's net profit is $1 million, that is a 10,000x increase in net profit, which would normally be considered an outlier in the data. But excluding it would be a mistake: we would be missing out on what could be a great investment opportunity. – Rakshasha Medhi Aug 28 '22 at 01:38

1 Answer


Mathematically, the log function is undefined at zero (it diverges to negative infinity as its argument approaches zero) and complex-valued for negative real numbers. So in the pure sense, a negative log axis is an impossibility. Some libraries (e.g. Matplotlib, I think) have implemented a symlog ("symmetric log") axis option that linearizes the scale in some interval around zero, uses magnitudes for negative values, and stitches the ranges together so that things are "well defined". However, this approach is better suited to static plots that don't support panning and zooming. It would take a non-trivial amount of rework to add it to Bokeh, and there has never been much demand for it, so no one has ever decided to spend time on it.
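
To make the symlog idea concrete, here is a minimal sketch of a signed log transform applied to the data itself, so the result could be drawn on ordinary linear axes. This is not a Bokeh feature, and it is a smooth variant rather than the piecewise scale some libraries use. The names `symlog`, `symlog_inverse`, and `linthresh`, and the sample values, are assumptions made up for illustration.

```python
import math


def symlog(value, linthresh=1.0):
    """Signed log transform: roughly linear near zero, logarithmic far from it."""
    # linthresh is a hypothetical tuning knob: magnitudes well below it stay
    # approximately linear, magnitudes well above it are compressed.
    return math.copysign(math.log10(1.0 + abs(value) / linthresh), value)


def symlog_inverse(scaled, linthresh=1.0):
    """Undo symlog, e.g. to label ticks with the original values."""
    return math.copysign(linthresh * (10.0 ** abs(scaled) - 1.0), scaled)


# Made-up sample values spanning several orders of magnitude on both sides of zero.
values = [-10_000.0, -99.0, -0.5, 0.0, 0.5, 99.0, 10_000.0]
scaled = [symlog(v) for v in values]
```

The transform keeps the ordering and the sign of the data and maps zero to zero, so negative values stay visible. The trade-off is that the axis then shows transformed numbers, and tick labels would have to be mapped back with something like `symlog_inverse`.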

Some other options:

  • If there are no zero values, plot using the absolute values of the coordinates to keep everything in the positive quadrant. Then you could use a log-log scale. You could distinguish "flipped" values by color or marker shape if need be.
  • Alternatively, plot the separate quadrants in four plots in a grid plot, again using absolute values to keep values positive in each plot so that a log-log scale is possible. You could flip axes as appropriate if desired.
  • Lastly, use a linear scale, but omit the outliers. Find some other way to show outliers, e.g. in a DataTable next to the plot.
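
The data preparation behind the first two options might look roughly like the sketch below. It only groups points by sign and takes absolute values; the plotting itself is left out. The function name `split_into_quadrants`, the quadrant labels, and the sample points are hypothetical, chosen for illustration.

```python
def split_into_quadrants(points):
    """Group (x, y) pairs by the signs of x and y, storing absolute values."""
    quadrants = {"+x +y": [], "-x +y": [], "-x -y": [], "+x -y": []}
    skipped = []  # points with a zero coordinate cannot go on a log axis
    for x, y in points:
        if x == 0 or y == 0:
            skipped.append((x, y))
            continue
        key = ("+x" if x > 0 else "-x") + " " + ("+y" if y > 0 else "-y")
        quadrants[key].append((abs(x), abs(y)))
    return quadrants, skipped


# Made-up sample data: one point per quadrant plus one with a zero coordinate.
sample = [(12.0, 350.0), (-4.5, 20.0), (-80.0, -0.3), (1500.0, -7.0), (0.0, 5.0)]
quadrants, skipped = split_into_quadrants(sample)
```

For the single-plot option, all four groups could go into one log-log figure with a different color or marker per key. For the grid option, each group would get its own log-log figure. Points collected in `skipped` need separate handling, since zero has no position on a log axis.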
bigreddot
  • Thanks! Yeah, I've been thinking about going with option #3 (separating the outliers into a table). I hesitated because I think it would be much better if everything could be shown in one chart, but if there's no other way then it's fine, I suppose. Cheers! – Rakshasha Medhi Aug 27 '22 at 02:46