
Pandas has a built-in wrapper around matplotlib's histogram method for use with dataframes. I want to be able to get the values of each bin. I am able to do this with matplotlib directly by converting my dataframe column into a list with this code:

import pandas as pd
import numpy as np
%matplotlib inline
import matplotlib.pyplot as plt

df = pd.DataFrame(np.random.rand(100))
df.columns = ['mydata']

myarray = df.mydata[(df.mydata > 0)].tolist()  # This is a list
weights = np.ones_like(myarray)/len(myarray)
hist = plt.hist(myarray, bins=10, weights=weights)
plt.show()

counts, bins, bars = hist
print(counts)

But when I do this with pandas' implementation of hist, as follows:

import pandas as pd
import numpy as np
%matplotlib inline
import matplotlib.pyplot as plt

df = pd.DataFrame(np.random.rand(100))
df.columns = ['mydata']

myarray = df.mydata[(df.mydata > 0)]  # This is a pandas Series
weights = np.ones_like(myarray)/len(myarray)
hist = myarray.hist(bins=10, weights=weights)
plt.show()

counts, bins, bars = hist
print(counts)

I get this:

TypeError: 'AxesSubplot' object is not iterable

Does anyone know a way to get the histogram values off a histogram generated by pandas' hist method? Thanks!

lukewitmer
  • The `pandas` implementation of `hist` does not return the counts, bins and patches that you get from `matplotlib`'s version. Instead, it returns the `Axes` that it plotted the histogram on. If you check the duplicate I linked above, I show a way you can back out the `counts` and `bins` from the `patches` associated with the `Axes` returned by `pandas`; but it's probably easier to just use `plt.hist`. – tmdavison Dec 08 '15 at 16:46
  • Good find tom. Thank you! – lukewitmer Dec 08 '15 at 16:47
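
For anyone landing here, a minimal sketch of what tmdavison's comment describes, not the code from the linked duplicate. It assumes a single histogram drawn on a fresh `Axes`, so that `ax.patches` holds only that histogram's bars, and it reads the counts and bin edges off the bar rectangles. The `Agg` backend line and the names `counts_from_patches` and `bins_from_patches` are my own additions for illustration. `np.histogram` is shown as an alternative that skips plotting entirely.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: non-interactive backend so this runs without a display
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

df = pd.DataFrame({"mydata": np.random.rand(100)})
series = df.mydata[df.mydata > 0]              # still a pandas Series
weights = np.ones(len(series)) / len(series)   # each point contributes 1/N

# Option 1: let pandas draw the histogram, then read the bars off the Axes.
fig, ax = plt.subplots()                       # fresh Axes, so it holds only these bars
series.hist(bins=10, weights=weights, ax=ax)
counts_from_patches = np.array([p.get_height() for p in ax.patches])
left_edges = [p.get_x() for p in ax.patches]
last = ax.patches[-1]
bins_from_patches = np.array(left_edges + [last.get_x() + last.get_width()])

# Option 2: compute the same numbers without plotting at all.
counts, bins = np.histogram(series.to_numpy(), bins=10, weights=weights)

print(counts_from_patches)
print(counts)
```

The two approaches should agree up to floating-point rounding. If all you need is the numbers, Option 2 or plain `plt.hist` is simpler, as the comment above suggests.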
