I'm working on a python implementation of an agent-based model using the 'mesa' framework (available in Github). In the model, each "agent" on a grid plays a Prisoner's Dilemma game against its neighbors. Each agent has a strategy that determines its move vs. other moves. Strategies with higher payoffs replace strategies with lower payoffs. In addition, strategies evolve through mutations, so new and longer strategies emerge as the model runs. The app produces a pandas dataframe that gets updated after each step. For example, after 106 steps, the df might look like this:
step strategy count score
0 0 CC 34 2.08
1 0 DD 1143 2.18
2 0 CD 1261 2.24
3 0 DC 62 2.07
4 1 CC 6 1.88
.. ... ... ... ...
485 106 DDCC 56 0.99
486 106 DD 765 1.00
487 106 DC 1665 1.31
488 106 DCDC 23 1.60
489 106 DDDD 47 0.98
Pandas/matplotlib creates a pretty good plot of this data, calling this simple plot function:
def plot_counts(df):
df1 = df.set_index('step')
df1.groupby('strategy')['count'].plot()
plt.ylabel('count')
plt.xlabel('step')
plt.title('Count of all strategies by step')
plt.legend(loc='best')
plt.show()
I get this plot:
Not bad, but here's what I can't figure out. The automatic legend quickly gets way too long and the low-frequency strategies are of little interest, so I want the legend to (1) include only the top 4 strategies listed in the above legend and (2) list those strategies in the order they appear in the last step of the model, based on their counts. Looking at the strategies in step 106 in the df, for example, I want the legend to show the top 4 strategies in order DC,DD,DDCC, and DDDD, but not include DCDC (or any other lower-count strategies that might be active).
I have searched through tons of pandas and matplotlib plotting examples but haven't been able to find a solution to this specific problem. It's clear that these plots are extremely customizable, so I suspect there is a way to do this. Any help would be greatly appreciated.