
I have a df that looks like this:

from pandas import DataFrame

headings = ['foo','bar','qui','gon','jin']
table = [[1,1,3,4,5],
         [1,1,4,5,6],
         [2,2,3,4,5],
         [2,2,4,5,6],
         ]
df = DataFrame(columns=headings,data=table)

    foo bar qui gon jin
0   1   1   3   4   5
1   1   1   4   5   6
2   2   2   3   4   5
3   2   2   4   5   6

What I want to do is average the values of all columns over the rows that share the same value in a given column. For example, I want to average all the columns over the rows with the same 'bar' value and then create a new dataframe with the result. I tried the following:

newDf = DataFrame([])

for i in df['bar'].loc[1:2]:
    newDf = newDf.append(df[df['foo'] == i].mean(axis=0),ignore_index=True)

And it outputs what I want:

    bar foo gon jin qui
0   1.00E+00    1.00E+00    4.50E+00    5.50E+00    3.50E+00
1   2.00E+00    2.00E+00    4.50E+00    5.50E+00    3.50E+00

But when I run the same loop over the values of another column, it does not output what I want:

for i in df['qui'].loc[1:2]:
    newDf = newDf.append(df[df['foo'] == i].mean(axis=0),ignore_index=True)

This produces:

    bar foo gon jin qui
0   NAN NAN NAN NAN NAN
1   NAN NAN NAN NAN NAN

Can you give me a hand?

Side question: how do I prevent the columns of the new dataframe from being ordered alphabetically? Is it possible to keep the column order of the original dataframe?
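For reference, here is a minimal sketch of the result I am after, assuming "same value" means exact equality in the key column. It uses `groupby(...).mean()` instead of the loop; `mean_by` is a hypothetical helper name I made up for illustration, and reindexing by the original columns is just one way to keep their order.

```python
import pandas as pd

headings = ['foo', 'bar', 'qui', 'gon', 'jin']
table = [[1, 1, 3, 4, 5],
         [1, 1, 4, 5, 6],
         [2, 2, 3, 4, 5],
         [2, 2, 4, 5, 6]]
df = pd.DataFrame(columns=headings, data=table)


def mean_by(frame, key):
    """Average every other column over the rows that share a value in `key`.

    Hypothetical helper, not part of pandas.
    """
    # as_index=False keeps the key as a regular column;
    # sort=False keeps the groups in order of first appearance.
    out = frame.groupby(key, as_index=False, sort=False).mean()
    # Select the columns in the original order.
    return out[list(frame.columns)]


by_bar = mean_by(df, 'bar')
by_qui = mean_by(df, 'qui')

# The loop over df['qui'] filters on df['foo'] == i, and no row has
# foo equal to 3 or 4, so the selection is empty.
no_match = df[df['foo'] == 4]

print(by_bar)
print(by_qui)
print(no_match.empty)
```

If I read my own loop correctly, the NaN rows come from that last point: the loop iterates over 'qui' values but still compares them against 'foo', so each filter matches no rows and the mean of an empty selection is NaN. The 'bar' version only works because 'bar' and 'foo' happen to hold the same values.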

elporsche