0

I have data as below, with 7 columns. I want to merge all rows that have the same value in columns 1-6, and sum the corresponding column-7 data.

0.1  0.2  0.3  0.1  0.2  0.3  0.001
0.1  0.2  0.3  0.1  0.2  0.3  0.002
0.9  0.9  0.9  0.1  0.1  0.1  0.002
0.9  0.9  0.9  0.8  0.8  0.8  0.1 
0.9  0.9  0.9  0.8  0.8  0.8  0.2 
0.1  0.2  0.3  0.1  0.2  0.3  0.001
0.1  0.2  0.3  0.1  0.1  0.1  0.002
0.9  0.9  0.9  0.1  0.1  0.1  0.002

I had tried this code:

import pandas as pd

df1 = pd.read_csv('test-data',sep='\s+',header=None,engine='python')
key1_sum = df1.groupby([0,1,2,3,4,5]).sum()

print(key1_sum)

I expected to get this output:

0.1 0.2 0.3 0.1 0.1 0.1  0.002
0.1 0.2 0.3 0.1 0.2 0.3  0.004
0.9 0.9 0.9 0.1 0.1 0.1  0.004
0.9 0.9 0.9 0.8 0.8 0.8  0.300

...but I actually got this output:

                             6
0   1   2   3   4   5         
0.1 0.2 0.3 0.1 0.1 0.1  0.002
                0.2 0.3  0.004
0.9 0.9 0.9 0.1 0.1 0.1  0.004
            0.8 0.8 0.8  0.300

So I just want to not output the head, and complete the elements in the corresponding column.

Could you please give me some suggestions about that?

smci
  • 32,567
  • 20
  • 113
  • 146
zanghang
  • 11
  • 1
  • 3
  • [pandas groupby()](https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.DataFrame.groupby.html) by default has `as_index=True, group_keys=True`. So if you grouped by multiple columns, that default will gives you an unwanted MultiIndex. Set `as_index=False` to get “SQL-style” grouped output (no funky row index). – smci Jun 29 '19 at 21:05

2 Answers2

5

When we try to groupby is we do not want the groupby key as index, we can pass to the as_index=False

key1_sum = df1.groupby([0,1,2,3,4,5],as_index=False).sum()
cs95
  • 379,657
  • 97
  • 704
  • 746
BENY
  • 317,841
  • 20
  • 164
  • 234
0

pandas groupby() by default has as_index=True, group_keys=True:

  • so if you grouped by multiple columns, the default will give you an (often unwanted) MultiIndex
  • set as_index=False to get “SQL-style” grouped output (no funky row index)
smci
  • 32,567
  • 20
  • 113
  • 146