Histogram has only one bar

Question

My data, a 196,585-record numpy array extracted from a pandas DataFrame, are being placed into a single bin by matplotlib's hist function. The data were originally integers, so I also tried converting them to float, as shown below, but they are still not being distributed among the 10 bins.

Interestingly, a small sub-sample (using df.sample(0.00x)) of the integer data are successfully distributed.

Any suggestions on where I may be erring in data preparation or use of matplotlib's histogram function would be appreciated.

histogram output

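import matplotlib.pyplot as plt

# Select the OPP_VALUE column for rows where UNIT is 'X'
x = df[df['UNIT'] == 'X'].OPP_VALUE.values
num_bins = 10
# Keep only positive values; `normed` was removed in later matplotlib releases, `density` replaces it
n, bins, patches = plt.hist(x[x > 0].astype(float), num_bins, density=False, facecolor='0.5', alpha=0.8)
plt.show()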

Try using `log=True`. Your sample contains a few very large values that skew the distribution. You may have to think about removing them. – cel, Aug 02 '16 at 17:49
Yup. Looks like you need to zoom all the way in. Can you post the output of `print(n); print(bins)`? – Mad Physicist, Aug 02 '16 at 17:52
You hit the nail on the head, so much so that even `log=True` doesn't work. `print(bins)` gives [ 1.00000000e+00 3.00000000e+09 6.00000000e+09 9.00000000e+09 1.20000000e+10 1.50000000e+10 1.80000000e+10 2.10000000e+10 2.40000000e+10 2.70000000e+10 3.00000000e+10] and `print(n)` gives [ 1.86114000e+05 0.00000000e+00 0.00000000e+00 0.00000000e+00 0.00000000e+00 0.00000000e+00 0.00000000e+00 0.00000000e+00 0.00000000e+00 1.00000000e+00] – A. Slowey, Aug 02 '16 at 18:05

score 5 · Answer 1 · answered May 23 '19 at 01:08

Most likely, the number of data points with x > 0.5 is very small, but a few outliers force the hist function to pick the scale it does. Try removing all values > 0.5 (or > 1 if you do not want to convert to float) and then plot again.
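As a rough illustration of why a single outlier collapses the histogram, here is a minimal sketch on synthetic data. The generated values and the 99th-percentile cutoff are assumptions made for the example, not the question's real data or a recommended threshold. It uses `np.histogram`, which computes the same counts that `plt.hist` draws.

```python
import numpy as np

# Synthetic stand-in for the question's data (an assumption, not the real values):
# many small positive integers plus a single extreme outlier.
rng = np.random.default_rng(0)
x = rng.integers(1, 1000, size=10_000).astype(float)
x = np.append(x, 3.0e10)

# Ten equal-width bins over the full range: the outlier stretches the range,
# so every other value lands in the first bin.
n_all, edges_all = np.histogram(x, bins=10)

# Drop values above a cutoff (the 99th percentile here, a hypothetical choice)
# and bin again: the counts now spread across all ten bins.
cutoff = np.percentile(x, 99)
x_trimmed = x[x <= cutoff]
n_trimmed, edges_trimmed = np.histogram(x_trimmed, bins=10)

print(n_all)
print(n_trimmed)
```

Passing `x_trimmed` to `plt.hist` in place of `x` should then show a populated histogram; whether trimming is acceptable depends on what the large values mean in your data.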

Lakshmi Prakash

I'm also facing this issue. Could you explain in a little more detail? I am plotting after removing outliers using the z-score and I am still getting this. – Scope, May 14 '21 at 16:00

score -1 · Answer 2 · answered Dec 16 '22 at 04:41

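You should modify the number of bins and their range, for example:

import numpy as np
number_of_bins = 200
# Edges from the minimum to the 99th percentile; N bins need N + 1 edges
bin_cutoffs = np.linspace(np.percentile(x, 0), np.percentile(x, 99), number_of_bins + 1)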
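To show how such percentile-based edges might be used, here is a small sketch. The data are synthetic and assumed for the example. Explicit edges go in through the `bins` argument, and values beyond the last edge are simply left out of the counts. `np.histogram` is used so the result can be inspected without a plot; `plt.hist(x, bins=bin_cutoffs)` accepts the same edges.

```python
import numpy as np

# Synthetic data (an assumption for illustration): small values plus one huge outlier.
rng = np.random.default_rng(1)
x = np.append(rng.integers(1, 1000, size=10_000).astype(float), 3.0e10)

number_of_bins = 200
# number_of_bins + 1 edges produce exactly number_of_bins bins.
bin_cutoffs = np.linspace(np.percentile(x, 0), np.percentile(x, 99), number_of_bins + 1)

# Values above the last edge (roughly the top 1%, including the outlier)
# fall outside the bins and are not counted.
counts, edges = np.histogram(x, bins=bin_cutoffs)

print(len(counts), counts.sum(), x.size)
```

Note that this hides the top 1% of values rather than removing them from `x`, so it changes only what the plot shows.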

Milad Naeimaei
