I have a matplotlib bar chart that uses yerr to simulate a box plot.

I would like to

  1. click on this bar chart
  2. get the y value for this click
  3. draw a red horizontal line at this y value
  4. run a t-test of bar chart data vs y value using scipy.stats.ttest_1samp
  5. update bar chart colors (blue if t << -2 and red if t >> 2)

I can do each of these steps separately, but not together.

I don't know how to feed the y value back to run the t-test and update the chart. I can feed a y value on first run and correctly color the bar charts, but I can't update the bar charts with the click y value.

Here are some toy data.

import pandas as pd
import numpy as np

np.random.seed(12345)

df = pd.DataFrame([np.random.normal(32000,200000,3650), 
                   np.random.normal(43000,100000,3650), 
                   np.random.normal(43500,140000,3650), 
                   np.random.normal(48000,70000,3650)], 
                  index=[1992,1993,1994,1995])

And here is what I have pieced together to draw the chart and add the line. I would also like to add an inset that maps colors to t statistics, but I think that is separate from updating the bar chart and I can add that on my own.

import sys

import matplotlib
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
from scipy.stats import ttest_1samp

class PointPicker(object):
    def __init__(self, df, y=0):

        # moments for bar chart "box plot"
        mus = df.mean(axis=1)
        sigmas = df.std(axis=1)
        obs = df.count(axis=1)
        ses = sigmas / np.sqrt(obs - 1)
        err = 1.96 * ses
        Nvars = len(df)

        # map t-statistics to colors
        ttests = ttest_1samp(df.transpose(), y)
        RdBus = plt.get_cmap('RdBu')
        colors = RdBus(1 / (1 + np.exp(ttests.statistic)))

        self.fig = plt.figure()
        self.ax = self.fig.add_subplot(111)

        # bar chart "box plot"
        self.ax.bar(list(range(Nvars)), mus, yerr=err, capsize=20, picker=5, color=colors)
        plt.xticks(list(range(Nvars)), df.index)
        plt.tick_params(top=False, bottom=False, left=False, right=False, labelleft=True, labelbottom=True)
        plt.gca().get_yaxis().set_major_formatter(matplotlib.ticker.FuncFormatter(lambda x, p: format(int(x), ',')))
        plt.title('Random Data for 1992 to 1995')

        self.fig.canvas.mpl_connect('pick_event', self.onpick)
        self.fig.canvas.mpl_connect('key_press_event', self.onpress)

    def onpress(self, event):
        """define some key press events"""
        if event.key.lower() == 'q':
            sys.exit()

    def onpick(self,event):
        x = event.mouseevent.xdata
        y = event.mouseevent.ydata
        self.ax.axhline(y=y, color='red')
        self.fig.canvas.draw()

if __name__ == '__main__':

    plt.ion()
    p = PointPicker(df, y=32000)
    plt.show()

After I click, the horizontal line appears, but the bar chart colors do not update.

Bonifacio2
Richard Herron
  • Can you please check whether this has already been answered in this question: https://stackoverflow.com/questions/43133017/how-to-change-colors-automatically-once-a-parameter-is-changed It is at least pretty similar, down to using the exact same data, and it updates the color as desired. As for the statistics, you may then edit your question to ask more specifically. – ImportanceOfBeingErnest Jul 17 '17 at 16:42
  • @ImportanceOfBeingErnest Yes, this achieves the same big-picture goal. I can adapt this to do what I want (with confidence intervals, t-tests, etc.). I am still interested in how to pass back the click data, although your linked solution is a better option. – Richard Herron Jul 17 '17 at 16:54
  • I'm not sure I understand what you mean by "pass back the click data". Does the answer below solve that problem? If so, you can accept it; if not, I would suggest you go into much more detail about your requirement and clearly state in what way the linked question's answer and the answer below do not help you. – ImportanceOfBeingErnest Jul 17 '17 at 18:05
  • @ImportanceOfBeingErnest Tom's answer shows how to use the y value to update the chart. The linked answer also helps. Thanks. – Richard Herron Jul 17 '17 at 19:37

1 Answer

You want to recalculate the t-tests inside onpick using the new y value, and then recalculate the colors the same way you did before. You can then loop over the bars created with ax.bar (here I save them as self.bars for easy access) and call bar.set_facecolor with each newly calculated color.
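
To make the recalculation concrete, here is a minimal standard-library sketch of what the t-statistic and the colormap position amount to. The helper names and the tiny sample are hypothetical and only for illustration; the answer's code uses scipy.stats.ttest_1samp and NumPy instead.

```python
import math
import statistics


def t_statistic(sample, popmean):
    """One-sample t-statistic: (mean - popmean) / (s / sqrt(n))."""
    n = len(sample)
    mean = statistics.mean(sample)
    s = statistics.stdev(sample)  # sample standard deviation (n - 1 denominator)
    return (mean - popmean) / (s / math.sqrt(n))


def colormap_position(t):
    """Logistic squashing used in the answer: 1 / (1 + exp(t)), a value in (0, 1)."""
    return 1.0 / (1.0 + math.exp(t))


sample = [2.0, 4.0, 6.0, 8.0]   # hypothetical data with mean 5
for y in (0.0, 5.0, 10.0):      # three hypothetical click heights
    t = t_statistic(sample, y)
    print(y, round(t, 3), round(colormap_position(t), 3))
```

A click well below the bar's mean gives a large positive t and a position near 0 (the red end of RdBu), a click at the mean gives 0.5 (the neutral middle), and a click well above gives a position near 1 (the blue end).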

I also added a try/except block so that a second click moves the existing line to the new y value instead of creating a new line.
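
As an alternative sketch of the same "create once, then move" idea, the line can be initialised to None and checked explicitly. LineMover is a hypothetical helper, not part of the code below, and the Agg backend is used only so the snippet runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend, an assumption so this runs without a GUI
import matplotlib.pyplot as plt


class LineMover:
    """Hypothetical helper: keeps a single horizontal line and moves it on demand."""

    def __init__(self, ax):
        self.ax = ax
        self.line = None  # no line exists until the first click

    def move_to(self, y):
        if self.line is None:
            # first call: create the line
            self.line = self.ax.axhline(y=y, color="red")
        else:
            # axhline draws through two points, so pass two y values
            self.line.set_ydata([y, y])
        return self.line


fig, ax = plt.subplots()
mover = LineMover(ax)
mover.move_to(1.0)   # creates the line
mover.move_to(2.5)   # moves the same line
```

The explicit None check avoids relying on an exception for control flow, at the cost of one extra attribute set in the constructor.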

import sys

import pandas as pd
import numpy as np
import matplotlib
import matplotlib.pyplot as plt
from scipy.stats import ttest_1samp

np.random.seed(12345)

df = pd.DataFrame([np.random.normal(32000,200000,3650), 
                   np.random.normal(43000,100000,3650), 
                   np.random.normal(43500,140000,3650), 
                   np.random.normal(48000,70000,3650)], 
                  index=[1992,1993,1994,1995])


class PointPicker(object):
    def __init__(self, df, y=0):

        # Store reference to the dataframe for access later
        self.df = df

        # moments for bar chart "box plot"
        mus = df.mean(axis=1)
        sigmas = df.std(axis=1)
        obs = df.count(axis=1)
        ses = sigmas / np.sqrt(obs - 1)
        err = 1.96 * ses
        Nvars = len(df)

        # map t-statistics to colors
        ttests = ttest_1samp(df.transpose(), y)
        RdBus = plt.get_cmap('RdBu')
        colors = RdBus(1 / (1 + np.exp(ttests.statistic)))

        self.fig = plt.figure()
        self.ax = self.fig.add_subplot(111)

        # bar chart "box plot". Store reference to the bars here for access later
        self.bars = self.ax.bar(
                list(range(Nvars)), mus, yerr=err, capsize=20, picker=5, color=colors)
        plt.xticks(list(range(Nvars)), df.index)
        plt.tick_params(top=False, bottom=False, left=False, right=False, labelleft=True, labelbottom=True)
        plt.gca().get_yaxis().set_major_formatter(matplotlib.ticker.FuncFormatter(lambda x, p: format(int(x), ',')))
        plt.title('Random Data for 1992 to 1995')

        self.fig.canvas.mpl_connect('pick_event', self.onpick)
        self.fig.canvas.mpl_connect('key_press_event', self.onpress)

    def onpress(self, event):
        """define some key press events"""
        if event.key.lower() == 'q':
            sys.exit()

    def onpick(self,event):
        x = event.mouseevent.xdata
        y = event.mouseevent.ydata

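        # If a line already exists, just update its y value, else create a horizontal line
        try:
            # axhline draws through two points, so pass two y values
            self.line.set_ydata([y, y])
        except AttributeError:
            # first click: self.line does not exist yet
            self.line = self.ax.axhline(y=y, color='red')

        # Recalculate the ttest
        newttests = ttest_1samp(self.df.transpose(), y)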
        RdBus = plt.get_cmap('RdBu')
        # Recalculate the colors
        newcolors = RdBus(1 / (1 + np.exp(newttests.statistic)))

        # Loop over bars and update their colors
        for bar, col in zip(self.bars, newcolors):
            bar.set_facecolor(col)

        self.fig.canvas.draw()

if __name__ == '__main__':

    #plt.ion()
    p = PointPicker(df, y=32000)
    plt.show()
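
The color calculation appears in both __init__ and onpick. One possible refactor, not part of the code above, is to pull it into a single helper. The sketch below assumes a hypothetical function name bar_colors and a small made-up DataFrame, and uses the Agg backend only so it runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend, an assumption so this runs without a GUI
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
from scipy.stats import ttest_1samp


def bar_colors(df, y, cmap_name="RdBu"):
    """Return one RGBA color per row of df from a one-sample t-test against y."""
    tstats = ttest_1samp(df.to_numpy(), y, axis=1).statistic
    return plt.get_cmap(cmap_name)(1 / (1 + np.exp(tstats)))


# hypothetical toy data: two rows with means near 0 and 5
rng = np.random.default_rng(0)
small = pd.DataFrame([rng.normal(0, 1, 200), rng.normal(5, 1, 200)], index=["a", "b"])

fig, ax = plt.subplots()
bars = ax.bar(range(len(small)), small.mean(axis=1), color=bar_colors(small, 0.0))

# simulate a click at y = 10, above both bars, and recolor them
for bar, col in zip(bars, bar_colors(small, 10.0)):
    bar.set_facecolor(col)
```

With a helper like this, onpick only needs to move the line, call the helper with the clicked y, and loop over self.bars.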

Here's some example output:

[three example output images]

tmdavison