I have the following DataFrame:

df

I'd like to display the number of people who are single with management positions that do and do not own their own homes, in a series. I currently have the following code:

df['housing'].groupby([df['marital'], df['job']]).value_counts()

However, this counts homeowners and non-homeowners for every combination of job and marital status. I'm only concerned with the single people in management positions.

How can I apply a filter to the resulting series so it only shows the data I am interested in?

  • Welcome to SO! When asking questions please paste in sample data as pandas dataframe, dictionary, or at least raw data instead of linking to a picture. – semblable Apr 03 '21 at 01:09
  • Does this answer your question? [Conditional grouping](https://stackoverflow.com/questions/45083000/pandas-groupby-with-conditional-formula) – SKPS Apr 03 '21 at 01:09
  • @SarahSawyer `pd.DataFrame()` is best, but as a dictionary or printed output works, too. Check out these pages for help getting started: [MCVE], [ask], [help] – semblable Apr 03 '21 at 01:27
  • @k_n_c Thank you, I am new to SO so I wasn't sure how to enter the dataframe. It's also very large – Sarah Sawyer Apr 03 '21 at 02:38
  • @SarahSawyer That's the "minimal" part of minimal reproducible example. Usually five or six rows are suggested unless more are required to reproduce an error or situation. – semblable Apr 03 '21 at 02:40

1 Answer

Try query:

df.query('job=="management" and marital=="single"')['housing'].value_counts()

Or you can use loc:

df.loc[df['job'].eq('management') & df['marital'].eq('single'),
       'housing'].value_counts()

Note your approach can also work if you slice the data afterward:

(df['housing'].groupby([df['marital'], df['job']])
   .value_counts()
   .loc[('single','management')]
)
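
To check that the three approaches above agree, here is a minimal sketch on made-up data. The six rows below are hypothetical (the question's real DataFrame was only posted as an image); only the column names `marital`, `job`, and `housing` come from the question.

```python
import pandas as pd

# Hypothetical sample data; only the column names are taken from the question.
df = pd.DataFrame({
    "marital": ["single", "single", "married", "single", "single", "married"],
    "job": ["management", "management", "management",
            "technician", "management", "services"],
    "housing": ["yes", "no", "yes", "yes", "yes", "no"],
})

# 1) Filter the rows with query, then count the housing values.
via_query = (
    df.query('job == "management" and marital == "single"')["housing"]
    .value_counts()
)

# 2) Same filter written as a boolean mask passed to loc.
mask = df["job"].eq("management") & df["marital"].eq("single")
via_loc = df.loc[mask, "housing"].value_counts()

# 3) Count every (marital, job) group first, then select one group
#    from the resulting MultiIndex with a tuple.
via_groupby = (
    df["housing"].groupby([df["marital"], df["job"]])
    .value_counts()
    .loc[("single", "management")]
)

print(via_query)
print(via_loc)
print(via_groupby)
```

On this sample all three give the same counts. Filtering first (options 1 and 2) avoids counting groups you will throw away, which may matter on a large frame.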
Quang Hoang
  • @QuangHoang Can you explain a bit more about how the tuple is working in the `.loc` with the `groupby`/`value_counts`? – semblable Apr 03 '21 at 01:50
  • @k_n_c see more details [in the doc](https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#why-does-assignment-fail-when-using-chained-indexing). In short, you can slice MultiIndex with tuples. – Quang Hoang Apr 03 '21 at 02:33
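
As a rough illustration of the tuple slicing mentioned in the comment above, the sketch below builds a small Series by hand with the same three index levels that `groupby(...).value_counts()` produces. The counts are invented for the example, and `Series.xs` is shown only as an alternative spelling, not something the answer itself uses.

```python
import pandas as pd

# Hand-built stand-in for the groupby/value_counts result (invented counts).
index = pd.MultiIndex.from_tuples(
    [
        ("married", "management", "yes"),
        ("single", "management", "no"),
        ("single", "management", "yes"),
        ("single", "technician", "yes"),
    ],
    names=["marital", "job", "housing"],
)
counts = pd.Series([1, 1, 2, 1], index=index)

# A partial tuple matches the leading levels (marital, job); the result
# is a Series indexed by the remaining level (housing).
single_management = counts.loc[("single", "management")]

# A full tuple addresses exactly one entry and returns a scalar.
single_management_yes = counts.loc[("single", "management", "yes")]

# xs performs the same partial selection with the levels named explicitly.
same_via_xs = counts.xs(("single", "management"), level=["marital", "job"])

print(single_management)
print(single_management_yes)
print(same_via_xs)
```

Each element of the tuple is matched against one index level, left to right, which is why `('single', 'management')` lines up with the `marital` and `job` keys passed to `groupby`.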