How to create a histogram with an aggregated dataset

Question

I have a dataset `new_products` which describes the number of months it's been since each product launched. I aggregated that data so that I have 'since_debut' and 'count', which give the number of products that debuted 1, 2, 3, ..., 60 months ago. I am having trouble creating a histogram with seaborn.

df = since_debut     count
          1           1784
          2           7345
          3           11111
          4           13255

sns.histplot(data=df, x="since_debut", y="count", bins=30, kde=True)

ValueError: Could not interpret value `since_debut` for parameter `x`

I'm unsure what is throwing this error and why seaborn can't interpret the aggregated data. Any help or advice is appreciated.

Following up on @Plagon's comment: you can just do `sns.countplot(data=df.reset_index(), x="since_debut", y="count", bins=30, kde=True)` to make sure `since_debut` is a column and not an index — mitoRibo, Jan 09 '23 at 20:58
@mitoRibo Can you clarify what you mean by since_debut being an index and not a real column? It's a calculated field that I used in the groupby. Am I missing something? — Mitchell.Laferla, Jan 09 '23 at 21:10
[groupby](https://pandas.pydata.org/docs/reference/api/pandas.DataFrame.groupby.html) sets the `by` variables as index by default. You can either use the proposal from @mitoRibo or set `groupby(..., as_index=False)`. — Plagon, Jan 09 '23 at 21:14
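To illustrate the point made in the comments, here is a minimal sketch of how the grouping key ends up in the index and two ways to turn it back into a column. The raw `new_products` rows below are made up for illustration; the asker's actual aggregation code is not shown in the question, so the `groupby(...).size()` call is an assumption.

```python
import pandas as pd

# Hypothetical raw data: one row per product, with months since launch.
new_products = pd.DataFrame({"since_debut": [1, 1, 2, 2, 2, 3]})

# Default groupby: the grouping key becomes the index, not a column.
indexed = new_products.groupby("since_debut").size().to_frame("count")
print("since_debut" in indexed.columns)  # False

# Option 1: move the index back into a regular column.
fixed_reset = indexed.reset_index()

# Option 2: keep the key as a column from the start.
fixed_as_index = new_products.groupby("since_debut", as_index=False).size()
fixed_as_index = fixed_as_index.rename(columns={"size": "count"})
```

Either resulting frame has `since_debut` as an ordinary column, so seaborn can resolve `x="since_debut"`.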

score 0 · Answer 1 · answered Jan 09 '23 at 21:12

Since you already have an aggregated dataset, shouldn't you use something like `barplot`?

sns.barplot(data=df, x="since_debut", y="count")

`countplot` should be used on the original, unaggregated data; it counts the rows in each category along one of the axes itself.

answered Jan 09 '23 at 21:12

Guru Stron
