I upvoted this question because there's a very subtle difference between pivot and groupby. I think you're looking for something similar to this:
import matplotlib.pyplot as plt
# Count the rows for each distinct age and draw one bar per age.
df.groupby('age').size().plot.bar(width=1)
plt.show()
However, I don't think there's a reasonable way to reproduce the histogram by grouping, because hist() needs the observations in their raw form, while groupby is designed to be followed by a function that transforms the data (such as count, min, mean, etc.).
To see this, notice that once you group by age and apply count, you no longer have the raw array of ages. For instance, there are 13 observations of people who are 40 years old. The raw data looks like (40, 40, ... , 40, 40), while the grouped count looks like:
age count
40 13
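As a minimal sketch of that difference, the snippet below builds a small made-up DataFrame (the values are hypothetical, not the data from the question) and compares the raw ages with the grouped counts:

```python
import pandas as pd

# Hypothetical toy data, not the dataset from the question.
df = pd.DataFrame({"age": [40, 40, 40, 22, 22, 35]})

# What a histogram consumes: one value per person.
raw_ages = df["age"].to_numpy()

# What groupby + size yields: one row per distinct age.
counts = df.groupby("age").size()

print(raw_ages)
print(counts)
```

The raw array still has six entries, while the grouped result has only three rows, one for each distinct age.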
This is not the shape of data a histogram needs. Another key difference is the binning. As you can see, the first plot counts all the observations of people with ages between 0 and 10 in a single bin. By grouping by age, you would instead have 11 groups inside that bin: one for people aged 0, one for people aged 1, one for people aged 2, etc.
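To make the binning difference concrete, here is a rough sketch using np.histogram as a stand-in for what a histogram plot computes. The ages and the bin edges are assumptions chosen for illustration:

```python
import numpy as np
import pandas as pd

# Hypothetical ages, chosen only to illustrate the point.
ages = pd.Series([0, 1, 1, 2, 5, 9, 10, 12, 18, 19], name="age")

# Histogram-style binning: ages fall into the ranges [0, 10) and [10, 20].
bin_counts, bin_edges = np.histogram(ages, bins=[0, 10, 20])

# groupby-style counting: one group per distinct age, no ranges involved.
per_age = ages.groupby(ages).size()

print(bin_counts)
print(per_age)
```

Here the histogram collapses ten observations into two bins, whereas grouping by age produces a separate count for each of the nine distinct ages.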
To summarize, groupby expects a function that will transform the original data, but in order to plot a histogram you need the data in its raw form. For this reason, pivot is the go-to solution for this kind of task: it also splits the data by survived, but does not apply any function to the data.
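A small sketch of the pivot approach, assuming a DataFrame with age and survived columns as in the question (the rows themselves are made up):

```python
import pandas as pd

# Hypothetical toy data; only the column names follow the question.
df = pd.DataFrame({
    "age": [22, 38, 26, 35, 40, 40],
    "survived": [0, 1, 1, 1, 0, 0],
})

# pivot reshapes without aggregating: one column per value of survived.
# Each original age stays in its own row, with NaN in the other column.
wide = df.pivot(columns="survived", values="age")

print(wide)
```

Because every raw age is still present, calling something like wide.plot.hist(alpha=0.5) on the result should draw one histogram per survived value, with the bins computed from the raw observations.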