
I have ages ranging from 18 to 69 (inclusive). I want to plot a histogram to show the distribution of these age values.

But I want the bin edges on the histogram's x-axis to be integers, and to cover only the range 18 to 69 (inclusive), not something like 15 to 75.

I achieved this using the code below:

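import matplotlib.pyplot as plt

data = df['age']  # df is assumed to be a pandas DataFrame with an 'age' column
num_bins = 5
bin_width = (max(data) - min(data)) / num_bins
# Truncate each edge to an integer so the x-axis ticks are whole ages
int_bin_edges = [int(min(data) + i * bin_width) for i in range(num_bins + 1)]
plt.hist(data, bins=int_bin_edges, edgecolor='black')
plt.xticks(int_bin_edges)
plt.show()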

The problem is that the bins now have unequal widths. The data representation is still accurate, though, and I can clearly see how many data points fall within the range each bin represents.

Is it OK to have unequal bin widths? For example: 18-28 (width 10), 28-38 (width 10), 38-48 (width 10), 48-58 (width 10), and finally 58-69 (width 11), which makes the bin widths unequal.
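The edges and widths listed above can be reproduced without the DataFrame. This is a minimal sketch that assumes the minimum and maximum ages are exactly 18 and 69; `lo` and `hi` are hypothetical stand-ins for `min(data)` and `max(data)`.

```python
# Hypothetical stand-ins for min(data) and max(data); the real df['age'] is not shown.
lo, hi = 18, 69
num_bins = 5

# 51 / 5 = 10.2, so the exact edges are not integers
bin_width = (hi - lo) / num_bins

# int() truncates toward zero, which drops the fractional part of each edge
int_bin_edges = [int(lo + i * bin_width) for i in range(num_bins + 1)]

# Width of each bin is the difference between consecutive edges
widths = [right - left for left, right in zip(int_bin_edges, int_bin_edges[1:])]

print(int_bin_edges)  # [18, 28, 38, 48, 58, 69]
print(widths)         # [10, 10, 10, 10, 11]
```

The extra unit ends up in the last bin because truncation discards 0.2, 0.4, 0.6 and 0.8 from the interior edges, while the final edge lands exactly on 69.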

Or do you recommend any other solution to this problem?

Trenton McKinney
Ali Haider
  • You seem to be asking a stats question, not a coding question. Also, you seem to be asking for opinions. This [post](https://stats.stackexchange.com/questions/90617/normalize-histogram-with-different-bin-width) explains histograms, including unequal bin widths. – Joe Aug 31 '23 at 10:23
  • _Is it ok to have unequal bin widths?_ _Or do you recommend any other solution to this problem?_ are asking for opinions, which is off-topic (not allowed). [What topics can I ask about here?](https://stackoverflow.com/help/on-topic) & [Don't advise on off-topic questions.](https://meta.stackoverflow.com/questions/276572/). Also, the question does not contain a complete [mre]. – Trenton McKinney Aug 31 '23 at 14:09
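The normalization by bin width that the first comment points to can be sketched in plain Python. The `ages` list below is made-up sample data, not the asker's, and the edges are the ones from the question. Each count is divided by the total count times the bin width, which is the same scaling `plt.hist(..., density=True)` applies, so a wider bin does not look taller merely because it is wider.

```python
from bisect import bisect_right

# Hypothetical sample ages, for illustration only
ages = [18, 22, 25, 31, 38, 40, 47, 52, 58, 60, 65, 69]
edges = [18, 28, 38, 48, 58, 69]  # unequal widths: 10, 10, 10, 10, 11

counts = [0] * (len(edges) - 1)
for age in ages:
    # Bins are half-open [left, right); the last bin also includes its right edge
    i = min(bisect_right(edges, age) - 1, len(counts) - 1)
    counts[i] += 1

widths = [right - left for left, right in zip(edges, edges[1:])]

# Density = count / (total count * bin width)
densities = [c / (len(ages) * w) for c, w in zip(counts, widths)]

# With this scaling the bar areas (density * width) sum to 1
area = sum(d * w for d, w in zip(densities, widths))

print(counts)  # [3, 1, 3, 1, 4]
```

Whether to plot raw counts or densities is a presentation choice; with widths of 10 and 11 the visual difference is small.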

0 Answers