I'm trying to make a function which produces attractive-looking scatter plots. I have two conflicting desires:
- Individual, separate data points are visible;
- Multiple data points that are close together, such that their dots overlap, should darken.
I'm currently accomplishing the latter through use of the alpha channel. The former I accomplish by including a copy of the scatter plot that has no alpha channel:
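import numpy as np
import matplotlib.pyplot as plt
N = 100000
fig = plt.figure()
fig.set_facecolor("white")
x = np.random.randn(N)
y = np.random.randn(N)
base_colour = (0.25, 0.4, 0.6)
# Lighter tint of the base colour (75% white, 25% base) for the opaque layer
adj_colour = tuple(0.75 + 0.25*c for c in base_colour)
# Opaque layer: keeps every isolated point visible
plt.scatter(x, y, color=adj_colour, linewidth=0)
# Translucent layer: overlapping points darken towards base_colour
plt.scatter(x, y, color=base_colour, alpha=0.05, linewidth=0)
plt.show()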
This produces a picture such as the following (depending on your RNG):
Note how the "outliers" are individually visible, but the centre is darker than the outer edges, implying that data points are more densely distributed there.
Note also, however, that most of the central area is all the same shade of blue: the alpha is high enough that many overlapping points together have a combined alpha of approximately 1. (In fact, it's so close to 1 that each pixel in the middle is the exact same shade of blue.) Due to the way overlapping alpha channels work, the amount of "white" left in each pixel decays exponentially with the density of the points.
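To put numbers on that exponential decay, here is a minimal sketch. It assumes standard "over" alpha blending, where each dot of opacity alpha lets a fraction (1 - alpha) of whatever is underneath show through, so n stacked dots leave (1 - alpha)**n. The helper name `background_fraction` is just something I made up for illustration.

```python
def background_fraction(alpha, n):
    """Fraction of the underlying colour still visible after n overlapping
    dots of opacity alpha, assuming standard 'over' alpha blending."""
    return (1.0 - alpha) ** n

# How quickly the underlying colour disappears at alpha = 0.05
for n in (1, 10, 50, 100, 200):
    print(n, background_fraction(0.05, n))
```

With alpha = 0.05, less than 1% of the underlying colour is left once about 100 dots overlap, which would explain why the whole centre ends up the same shade.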
I could use a lower alpha. However, that wouldn't look as nice for graphs that have significantly fewer data points, or for areas that are less densely populated. Is there any way around this, or am I going to have to make the user of my function type in an alpha value that works nicely for them?
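To make the trade-off concrete, here is a rough sketch that inverts the usual blending rule, coverage = 1 - (1 - alpha)**n, to get the alpha at which n overlapping dots reach a chosen coverage. Both the helper name `alpha_for_coverage` and the 0.99 target are my own assumptions, purely for illustration.

```python
def alpha_for_coverage(n, coverage=0.99):
    """Alpha at which n overlapping dots cover the given fraction of the
    underlying colour, assuming standard 'over' alpha blending."""
    return 1.0 - (1.0 - coverage) ** (1.0 / n)

# The alpha needed shrinks rapidly as the expected overlap grows
for n in (10, 100, 1000, 10000):
    print(n, alpha_for_coverage(n))
```

The alpha that saturates at 10 overlapping dots is very different from the one that saturates at 10,000, so a single hard-coded value can't suit both sparse and dense plots.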
Otherwise, is there a way to accomplish what I'm doing without making two scatter plots in the same figure?