I am planning to implement the below logic to get the student score.
Find students getting more than 60 marks
And then get that student marks too based on Subject Student Key!
Input Data
data = [['Maths', 100,80,20], ['Science', 80,20,10]]
df = pd.DataFrame(data, columns = ['Subject', 'Student A','Student B','Student C'])
df.set_index("Subject",inplace=True)
Student A Student B Student C
Subject
Maths 100 80 20
Science 80 20 10
Get Students getting more than 60 marks
df=df[df.gt(60)]
rank_df = df.rank(axis=0,method='average',pct=False,ascending=False)
marks_list = []
for i in range(0,len(rank_df)):
label_series = rank_df.iloc[i,:]
labels_notna = label_series.sort_values(ascending=True)[label_series.notna()].index
marks_list.append(",".join(labels_notna))
df['Student gt 60'] = marks_list
new_df = df['Student gt 60'].str.split(',', expand = True)
new_df.reset_index(inplace=True)
new_df.columns=["Subject","Top 1","Top 2"]
new_df = pd.melt(new_df, id_vars=['Subject'], value_name='Student')
data= new_df[["Subject","Student"]]
data.loc[~data["Student"].isna()]
Subject Student
0 Maths Student A
1 Science Student A
2 Maths Student B
I am planning to get an associated score for the Subject/Student key in the same data frame but not able to work it out.
Required Output:
Subject Student Score
0 Maths Student A 100
1 Maths Student B 80
2 Science Student A 80
Could somebody help me with some pointers!