
I have the following code:

    import pandas as pd
    
    status = ['Pass','Fail']
    item_info = pd.DataFrame({
        'student': ['John','Alice','Pete','Mike','John','Alice','Joseph'],
        'test': ['Pass','Pass','Pass','Pass','Pass','Pass','Pass']
    })
    
    item_status = pd.crosstab(item_info['student'],item_info['test'])
    print(item_status)

Which produces:

| Student | Pass |
|---------|------|
| Alice   | 2    |
| John    | 2    |
| Joseph  | 1    |
| Mike    | 1    |
| Pete    | 1    |

However, I want to create something that looks like this:

| Student | Pass | Fail | Total |
|---------|------|------|-------|
| Alice   | 2    | 0    | 2     |
| John    | 2    | 0    | 2     |
| Joseph  | 1    | 0    | 1     |
| Mike    | 1    | 0    | 1     |
| Pete    | 1    | 0    | 1     |

How do I change the code so that it includes a Fail column with 0 for all of the students and provides a total?

1 Answer


A generic solution, which adds an extra label without needing to know the existing labels in advance, is to use `reindex`:

    # collect the labels already present, then append the extra one
    cols = item_info['test'].unique().tolist() + ['Fail']
    pd.crosstab(item_info['student'], item_info['test']).reindex(columns=cols, fill_value=0)
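Since the question already defines a `status` list, the same `reindex` call can take that list directly, which also fixes the column order. A minimal sketch, assuming `status` holds every label you want as a column (anything not in the list would be dropped):

```python
import pandas as pd

status = ['Pass', 'Fail']  # desired columns, in display order
item_info = pd.DataFrame({
    'student': ['John', 'Alice', 'Pete', 'Mike', 'John', 'Alice', 'Joseph'],
    'test': ['Pass', 'Pass', 'Pass', 'Pass', 'Pass', 'Pass', 'Pass'],
})

# reindex adds any label missing from the crosstab and fills it with 0
item_status = (
    pd.crosstab(item_info['student'], item_info['test'])
    .reindex(columns=status, fill_value=0)
)
print(item_status)
```

This keeps everything in one chained expression and does not depend on which results happen to appear in the data.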

I assumed above that you wanted to chain methods. If you don't need that, you can simply assign the new column:

    item_status = pd.crosstab(item_info['student'], item_info['test'])
    item_status['Fail'] = 0

    test     Pass  Fail
    student            
    Alice       2     0
    John        2     0
    Joseph      1     0
    Mike        1     0
    Pete        1     0
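For the Total column the question also asks about, one option is a row-wise sum once the Fail column exists. This is a sketch rather than the only way to do it; the column name `Total` is taken from the desired output in the question:

```python
import pandas as pd

item_info = pd.DataFrame({
    'student': ['John', 'Alice', 'Pete', 'Mike', 'John', 'Alice', 'Joseph'],
    'test': ['Pass', 'Pass', 'Pass', 'Pass', 'Pass', 'Pass', 'Pass'],
})

# build the crosstab and make sure both result columns exist
item_status = (
    pd.crosstab(item_info['student'], item_info['test'])
    .reindex(columns=['Pass', 'Fail'], fill_value=0)
)

# sum across each row (axis=1) to get the per-student total
item_status['Total'] = item_status.sum(axis=1)
print(item_status)
```

Compute the total after adding Fail, so that the sum covers both columns if failures show up in the data later.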
anky