I have a df that looks like this:
from pandas import DataFrame
headings = ['foo', 'bar', 'qui', 'gon', 'jin']
table = [[1, 1, 3, 4, 5],
         [1, 1, 4, 5, 6],
         [2, 2, 3, 4, 5],
         [2, 2, 4, 5, 6]]
df = DataFrame(columns=headings, data=table)
   foo  bar  qui  gon  jin
0    1    1    3    4    5
1    1    1    4    5    6
2    2    2    3    4    5
3    2    2    4    5    6
What I want to do is average the values of all columns over the rows that share the same value in a given column. For example, I want to average all the columns over the rows with the same 'bar' value and then create a dataframe with the result. I tried the following:
newDf = DataFrame([])
for i in df['bar'].loc[1:2]:
    newDf = newDf.append(df[df['foo'] == i].mean(axis=0), ignore_index=True)
And it outputs what I want:
   bar       foo       gon       jin       qui
0  1.00E+00  1.00E+00  4.50E+00  5.50E+00  3.50E+00
1  2.00E+00  2.00E+00  4.50E+00  5.50E+00  3.50E+00
But when I try the same thing with the values of another column, it does not output what I want:
for i in df['qui'].loc[1:2]:
    newDf = newDf.append(df[df['foo'] == i].mean(axis=0), ignore_index=True)
This produces:
   bar  foo  gon  jin  qui
0  NaN  NaN  NaN  NaN  NaN
1  NaN  NaN  NaN  NaN  NaN
Can you give me a hand?
Side question: how do I prevent the columns of the new dataframe from being ordered alphabetically? Is it possible to keep the column order of the original dataframe?
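
For reference, here is a minimal sketch of the result I am after, assuming `groupby` is a suitable tool for this (I have not confirmed it is the recommended approach). `average_by` is a hypothetical helper name of my own, and the final column selection is only my guess at how to keep the original column order:

```python
import pandas as pd

headings = ['foo', 'bar', 'qui', 'gon', 'jin']
table = [[1, 1, 3, 4, 5],
         [1, 1, 4, 5, 6],
         [2, 2, 3, 4, 5],
         [2, 2, 4, 5, 6]]
df = pd.DataFrame(columns=headings, data=table)


def average_by(frame, key):
    """Hypothetical helper: average every column over rows sharing the same `key` value."""
    # sort=False keeps the groups in order of first appearance in `frame`.
    means = frame.groupby(key, sort=False).mean().reset_index()
    # Re-select the columns in the original order instead of alphabetical order.
    return means[list(frame.columns)]


by_bar = average_by(df, 'bar')  # one row per distinct 'bar' value
by_qui = average_by(df, 'qui')  # one row per distinct 'qui' value
print(by_bar)
print(by_qui)
```

With the data above, `by_bar` should hold the same averages as my first output, and `by_qui` should give one averaged row per distinct 'qui' value instead of NaN.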