I have a DataFrame, outcome2, from which I generate a grouped boxplot as follows:

In [11]: outcome2.boxplot(column='Hospital 30-Day Death (Mortality) Rates from Heart Attack',by='State')
        plt.ylabel('30 Day Death Rate')
        plt.title('30 Day Death Rate by State')
Out [11]:

[image: boxplot of 30 Day Death Rate by State, states in alphabetical order]

What I'd like to do is sort the plot by the median for each state, instead of alphabetically. Not sure how to go about doing so.

Chris
  • What do you mean by "inside of alphabetically"? Do you only want to look at the first letter of each state's name? You can sort by the median or alphabetically, but you can't do both. – Phillip Cloud Oct 19 '13 at 21:02
  • @PhillipCloud typo, sorry. Should have read "instead of alphabetically" as is the default. – Chris Oct 19 '13 at 21:49

1 Answer

To sort by the median, compute each column's median, sort the result, and use its Index to reorder the DataFrame's columns (here df holds one column per state):

In [45]: df.iloc[:10, :5]
Out[45]:
      AK     AL     AR     AZ     CA
0  0.047  0.199  0.969 -0.205  1.053
1  0.206  0.132 -0.712  0.111 -0.254
2  0.638  0.233 -0.907  1.284  1.193
3  1.234  0.046  0.624  0.485 -0.048
4 -1.362 -0.559  1.108 -0.501  0.111
5  1.276 -0.954  0.653 -0.175 -0.287
6  0.524 -1.785 -0.887  1.354 -0.431
7  0.111  0.762 -0.514  0.808  0.728
8  1.301  0.619  0.957  1.542 -0.087
9 -0.892  2.327  1.363 -1.537  0.142

In [46]: med = df.median()

In [47]: med = med.sort_values()  # Series.sort() was removed in later pandas releases

In [48]: newdf = df[med.index]

In [49]: newdf.iloc[:10, :5]
Out[49]:
      PA     CT     LA     RI     MO
0 -0.667  0.774 -0.999 -0.938  0.155
1  0.822  0.390 -0.014 -2.228  0.570
2 -1.037  0.838 -0.673  2.038  0.809
3  0.620  2.845 -0.523 -0.151 -0.955
4 -0.918  1.043  0.613  0.698 -0.446
5 -0.767  0.869 -0.496 -0.925 -0.374
6 -0.495  0.437  1.245 -1.046  0.894
7 -1.283  0.358  0.016  0.137  0.511
8 -0.018 -0.047 -0.639 -0.385  0.080
9 -1.705  0.986  0.605  0.295  0.302

In [50]: med.head()
Out[50]:
PA   -0.117
CT   -0.077
LA   -0.072
RI   -0.069
MO   -0.053
dtype: float64

The resulting figure:

[image: boxplot with states ordered by median]
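The df above is in wide form (one column per state), while the question's outcome2 is in long form (a State column plus a value column). A minimal sketch of bridging the two, assuming only pandas and NumPy; the helper name wide_sorted_by_median and the toy data are hypothetical and are not part of the original answer:

```python
import numpy as np
import pandas as pd


def wide_sorted_by_median(df, by, column):
    """Reshape long data to one column per group, ordered by group median."""
    # Median of `column` within each group, smallest first.
    # median() skips NaN values by default.
    order = df.groupby(by)[column].median().sort_values().index
    # One column per group; groups of unequal size are padded with NaN.
    wide = pd.DataFrame(
        {key: grp.reset_index(drop=True) for key, grp in df.groupby(by)[column]}
    )
    # Reorder the columns by ascending median.
    return wide[order]


# Hypothetical toy data standing in for outcome2.
toy = pd.DataFrame({
    "State": ["TX", "TX", "TX", "AK", "AK", "NY", "NY", "NY"],
    "Rate": [15.0, 16.0, 17.0, 12.0, 13.0, 18.0, np.nan, 20.0],
})
wide = wide_sorted_by_median(toy, by="State", column="Rate")
print(list(wide.columns))  # ['AK', 'TX', 'NY']
```

Calling wide.boxplot(rot=90) on the result should then draw one box per state in median order, since DataFrame.boxplot plots columns in the order they appear.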

Phillip Cloud
  • This is a good solution, but it throws a KeyError if there is any missing data in your dataframe. I'm hoping to find a work-around for that problem. – rocksNwaves Mar 17 '20 at 01:23