I have a DataFrame (df3) that looks like:

  CD1  CD2  CD3   ...  FG1  FG2
0 3.8  2.9  0     ...  0.1  0.1
1 0.1  0    4.1   ...  5.2  0
# 35 columns and 2 rows

And I plot a stacked bar chart using:

import matplotlib.pyplot as plt
import numpy as np

# one colour per column (35 columns)
colors = plt.cm.jet(np.linspace(0, 1, 35))
df3.plot(kind='barh', stacked=True, figsize=(15, 10), color=colors, width=0.08)

My issue is that this plots all 35 columns, whereas I only want to plot the n columns with the highest values in each row, e.g. only CD1 and CD2 for row 0, and only CD3 and FG1 for row 1:

  CD1  CD2  CD3   ...  FG1  FG2
0 3.8  2.9  -     ...  -     -
1  -    -   4.1   ...  5.2   -

Is there a way to do this?

marc_s
startswithH

1 Answer

If I understand the question correctly, you can do this by taking the maximum of each column and then using nlargest to pick the top n columns (10 in this example):

df.max().nlargest(10)

The result is a Series indexed by column name, so its index can be used to select those columns from df before plotting.
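As a rough sketch of how that might look in practice, the snippet below uses a small made-up frame (the five columns and the value n = 2 are assumptions for illustration, not the asker's real 35-column data). It shows the column-wise selection described above, plus one possible per-row variant based on rank and where, since the question's expected output keeps a different set of columns for each row.

```python
import pandas as pd

# Hypothetical stand-in for the asker's data (only 5 of the 35 columns)
df = pd.DataFrame(
    {"CD1": [3.8, 0.1], "CD2": [2.9, 0.0], "CD3": [0.0, 4.1],
     "FG1": [0.1, 5.2], "FG2": [0.1, 0.0]}
)
n = 2  # assumed number of values to keep

# Column-wise: the n columns whose maximum value is largest
top_cols = df.max().nlargest(n).index
top_overall = df[top_cols]

# Row-wise: rank each row in descending order and keep the n largest
# values per row; everything else becomes NaN
ranks = df.rank(axis=1, method="first", ascending=False)
top_per_row = df.where(ranks <= n)

# Drop columns that are not in the top n of any row
top_per_row = top_per_row.dropna(axis=1, how="all")
```

Either frame can then be passed to the original plot call, for example `top_per_row.fillna(0).plot(kind='barh', stacked=True)`. Filling the NaN values with 0 first is a cautious choice so that the stacked bars are computed from plain numbers.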

filbranden