Pandas has a built-in wrapper around matplotlib's histogram method for use with DataFrames. I want to be able to get the value of each bin. I can do this with matplotlib directly by converting my DataFrame column into a list with this code:
import pandas as pd
import numpy as np
%matplotlib inline
import matplotlib.pyplot as plt
df = pd.DataFrame(np.random.rand(100))
df.columns = ['mydata']
myarray = df.mydata[(df.mydata > 0)].tolist() # This is a list
weights = np.ones_like(myarray)/len(myarray)
hist = plt.hist(myarray, bins=10, weights=weights)
plt.show()
counts, bins, bars = hist
print(counts)
But when I do this with pandas' implementation of hist, as follows:
import pandas as pd
import numpy as np
%matplotlib inline
import matplotlib.pyplot as plt
df = pd.DataFrame(np.random.rand(100))
df.columns = ['mydata']
myarray = df.mydata[(df.mydata > 0)] # This is a pandas Series
weights = np.ones_like(myarray)/len(myarray)
hist = myarray.hist(bins=10, weights=weights)
plt.show()
counts, bins, bars = hist
print(counts)
I get this:
TypeError: 'AxesSubplot' object is not iterable
Does anyone know a way to get the histogram values off a histogram generated by pandas' hist method? Thanks!
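For context on the error: `Series.hist` draws the plot and returns the matplotlib Axes it drew on, rather than the `(counts, bins, bars)` tuple that `plt.hist` returns, which is why the unpacking fails. Below is a minimal sketch of two possible ways to recover the values, assuming the same kind of data as above. The variable names (`series`, `bin_edges`, `heights`) and the fixed random seed are my own choices for illustration, and the second option assumes the histogram bars are the only patches on the Axes.

```python
import numpy as np
import pandas as pd

# Illustrative data; the fixed seed is only there to make the run reproducible.
rng = np.random.default_rng(0)
df = pd.DataFrame({"mydata": rng.random(100)})
series = df["mydata"][df["mydata"] > 0]  # still a pandas Series

# Weight each observation so the bar heights sum to 1.
weights = np.ones(len(series)) / len(series)

# Series.hist returns the Axes object, not (counts, bins, bars).
ax = series.hist(bins=10, weights=weights)

# Option 1: recompute the same histogram with NumPy.
counts, bin_edges = np.histogram(series, bins=10, weights=weights)

# Option 2: read the bar heights back from the Axes.
# Assumes the bars are the only patches drawn on this Axes.
heights = np.array([patch.get_height() for patch in ax.patches])

print(counts)
print(bin_edges)
```

Calling `plt.hist(series, bins=10, weights=weights)` directly on the Series should also work and returns the tuple, so the conversion to a list may not be needed in the first place.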