
I'd like to get the histogram values (bin counts and bin edges) from the matplotlib.axes.AxesSubplot that is returned by the pandas.Series.hist method. Is there a method for this? I couldn't find a suitable attribute in the object's attribute list.

import pandas as pd
import matplotlib.pyplot as plt

serie = pd.Series([0.0,950.0,-70.0,812.0,0.0,-90.0,0.0,0.0,-90.0,0.0,-64.0,208.0,0.0,-90.0,0.0,-80.0,0.0,0.0,-80.0,-48.0,840.0,-100.0,190.0,130.0,-100.0,-100.0,0.0,-50.0,0.0,-100.0,-100.0,0.0,-90.0,0.0,-90.0,-90.0,63.0,-90.0,0.0,0.0,-90.0,-80.0,0.0,])
hist = serie.hist()
# I want to get the histogram values (counts and bin edges) from the hist variable.

I know I can get the histogram values with np.histogram, but I want to use the pandas hist method.

fx-kirin
    I'm not sure this is possible: the Pandas [plotting.py source](https://github.com/pydata/pandas/blob/master/pandas/tools/plotting.py) seems to throw away the binned data, bin edges and patch objects that matplotlib returns to it. Why not plot directly with `plt.hist`? – xnx Nov 24 '15 at 09:18

1 Answer


As xnx pointed out in the comments, this isn't as easily accessible as it would be if you used plt.hist. However, if you really want to use the pandas hist function, you can get this information from the patches that are added to the AxesSubplot (here called hist) when you call serie.hist.

Here's a function to loop through the patches, and return the bin edges and histogram counts:

import pandas as pd
import matplotlib.pyplot as plt

serie = pd.Series([0.0,950.0,-70.0,812.0,0.0,-90.0,0.0,0.0,-90.0,0.0,-64.0,208.0,0.0,-90.0,0.0,-80.0,0.0,0.0,-80.0,-48.0,840.0,-100.0,190.0,130.0,-100.0,-100.0,0.0,-50.0,0.0,-100.0,-100.0,0.0,-90.0,0.0,-90.0,-90.0,63.0,-90.0,0.0,0.0,-90.0,-80.0,0.0,])
hist = serie.hist()

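def get_hist(ax):
    """Return (counts, bin_edges) recovered from the bar patches on ax."""
    n, bins = [], []
    for rect in ax.patches:
        ((x0, y0), (x1, y1)) = rect.get_bbox().get_points()
        n.append(y1 - y0)  # bar height is the count for this bin
        bins.append(x0)  # left edge of each bin
    bins.append(x1)  # also get right edge of last bin

    return n, bins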

n, bins = get_hist(hist)

print(n)
print(bins)

plt.show()

Here's the output of n and bins:

[36.0, 1.0, 3.0, 0.0, 0.0, 0.0, 0.0, 0.0, 2.0, 1.0]                          # n
[-100.0, 5.0, 110.0, 215.0, 320.0, 425.0, 530.0, 635.0, 740.0, 845.0, 950.0] # bins

And here's the histogram plot to check:

[Image: histogram plot produced by serie.hist()]
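As a numerical cross-check, here is a minimal sketch that runs np.histogram (mentioned in the question) on the same data. It assumes the default bins=10 of pandas.Series.hist was used, as in the call above; the names data, counts and edges are just illustrative.

```python
import numpy as np

# Same values as the `serie` Series above
data = [0.0, 950.0, -70.0, 812.0, 0.0, -90.0, 0.0, 0.0, -90.0, 0.0, -64.0,
        208.0, 0.0, -90.0, 0.0, -80.0, 0.0, 0.0, -80.0, -48.0, 840.0, -100.0,
        190.0, 130.0, -100.0, -100.0, 0.0, -50.0, 0.0, -100.0, -100.0, 0.0,
        -90.0, 0.0, -90.0, -90.0, 63.0, -90.0, 0.0, 0.0, -90.0, -80.0, 0.0]

# Assumption: bins=10 matches the default used by pandas.Series.hist
counts, edges = np.histogram(data, bins=10)

print(counts.tolist())  # one count per bin
print(edges.tolist())   # one more edge than there are bins
```

If the counts and edges printed here agree with n and bins from get_hist, the values recovered from the patches are consistent with the binned data.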

tmdavison