I'm struggling with reshaping a pandas
data frame. Basically, I'm experimenting with sports statistics and I'm trying to tidy up box scores. I have a data frame which has visitor team and home team listed, along with their scores, all in unique columns like this:
df = pd.DataFrame(data={'GameID': [1, 2], 'Date': ['9/10/2020', '9/13/2020'], 'Visitor': ['Houston Texans', 'Seattle Seahawks'], 'Score_V': [20,38], 'Home':['Kansas City Chiefs', 'Atlanta Falcons'], 'Score_H':[34,25]})
I'm reshaping the data frame so that the teams and scores all end up in their own column, so that I end up with something like this:
df2 = pd.DataFrame(data={'GameID': [1, 1, 2, 2], 'Date': ['9/10/2020', '9/10/2020', '9/13/2020', '9/13/2020'], 'Team': ['Houston Texans', 'Kansas City Chiefs', 'Seattle Seahawks', 'Atlanta Falcons'], 'Location':['Away', 'Home', 'Away', 'Home'], 'Score': [20,34,38,25]})
I came up with the following solution, where I melted the data frame and used .loc()
logic to find and replace values in the various columns. But I feel like this solution is not very elegant and that I missed some obvious capability of pandas.melt()
.
df = df.melt(id_vars=['GameID','Date','Home','Visitor'])
df = df.rename(columns={"variable": "Location", "value": "Score"})
df.loc[df['Location'] == 'Score_V', 'Team'] = df['Visitor']
df.loc[df['Location'] == 'Score_V', 'Location'] = 'Away'
df.loc[df['Location'] == 'Score_H', 'Team'] = df['Home']
df.loc[df['Location'] == 'Score_H', 'Location'] = 'Home'
df = df.drop(columns=['Home', 'Visitor'])
Is there a simpler solution than this?