
I am using Python for the Titanic disaster competition on Kaggle. The dataset (df) contains three attributes for each passenger: 'Gender' (1/0), 'Age' and 'Pclass' (1/2/3). I want to obtain the median age for each Gender-Pclass combination.

The end result should be a dataframe like this:

Gender Class
1      1 
0      2
1      3 
0      1
1      2
0      3

The median age will be calculated later.

I tried to create the dataframe as follows:

unique_gender = pd.DataFrame(df.Gender.unique())
unique_class = pd.DataFrame(df.Class.unique())

reqd_df = pd.merge(unique_gender, unique_class, how = 'outer')

But the output I get is:

   0
0  3
1  1
2  2
3  0

Can someone please help me get the desired output?

cchamberlain
Rohan Bapat

1 Answer


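You want df.groupby(['Gender', 'Pclass'])['Age'].median() (per JohnE). The column names here follow the ones given in the question; adjust them if your dataframe names them differently.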
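For reference, here is a minimal sketch of how that could look end to end. The sample data is made up for illustration, the column names are assumed to be 'Gender', 'Pclass' and 'Age' as described in the question, and the names `median_age`, `MedianAge` and `combos` are hypothetical. The merge in the question most likely went wrong because both single-column frames have a column named `0`, so `pd.merge` joined on that column rather than building every combination.

```python
import pandas as pd

# Hypothetical sample data; column names assumed from the question's description.
df = pd.DataFrame({
    "Gender": [1, 0, 1, 0, 1, 0, 1, 0],
    "Pclass": [1, 1, 2, 2, 3, 3, 1, 3],
    "Age":    [38.0, 54.0, 27.0, 35.0, 4.0, 22.0, 58.0, 26.0],
})

# One row per Gender-Pclass combination present in the data,
# with the median age as an ordinary column.
median_age = (
    df.groupby(["Gender", "Pclass"])["Age"]
      .median()
      .reset_index(name="MedianAge")
)

# If only the combinations are wanted (ages to be filled in later),
# build the Cartesian product of the unique values explicitly.
combos = pd.MultiIndex.from_product(
    [sorted(df["Gender"].unique()), sorted(df["Pclass"].unique())],
    names=["Gender", "Pclass"],
).to_frame(index=False)

print(median_age)
print(combos)
```

Note that groupby only returns combinations that actually occur in the data, while the product lists every pairing whether or not any passenger matches it.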

Back2Basics