3
for sheet in prio_list:
        df_pivot_comp = pd.pivot_table(pd.read_excel(excel_file,sheet_name=sheet),\
        values=['COMPILER WARNING'], columns=None, index=['GROUP','DOMAIN'],\
        aggfunc='count',margins=True,margins_name='Total',fill_value=0)

To this code , output would be

enter image description here

I want the output like :

enter image description here

I tried :

group_count = df_pivot_comp.sum(level='GROUP')

But it's not working Please help me with this

Ynjxsjmh
  • 28,441
  • 6
  • 34
  • 52

1 Answers1

0

With the following toy dataframe:

import pandas as pd

df = pd.DataFrame(
    {
        "A": ["foo", "foo", "foo", "foo", "foo", "bar", "bar", "bar", "bar"],
        "B": ["one", "two", "three", "two", "five", "six", "seven", "six", "seven"],
        "C": [1, 2, 2, 3, 3, 4, 5, 6, 7],
    }
)

print(df)
# Output

     A      B  C
0  foo    one  1
1  foo    two  2
2  foo  three  2
3  foo    two  3
4  foo   five  3
5  bar    six  4
6  bar  seven  5
7  bar    six  6
8  bar  seven  7

First, create a pivot table with Pandas pivot_table and use margins=True and margins_name="total" to sum rows and columns:

table = pd.pivot_table(
    df,
    values="C",
    index=["A", "B"],
    aggfunc=np.sum,
    margins=True,
    margins_name="total",
)

print(table)
# Output

              C
A     B        
bar   seven  12
      six    10
foo   five    3
      one     1
      three   2
      two     5
total        33

Then, using Pandas advanced indexing:

# Drop column "total" and sort index to avoid performance warning
table = table.sort_index()

# Calculate total for rows and add in a new column
rows_total = table.reset_index().groupby(["A"]).agg({"C": sum})
for idx in table.index.get_level_values(0).unique():  # bar, foo, total
    table.loc[(idx,), "total"] = rows_total.loc[idx].values[0]

# Sort index and format values
table = table.reindex(index=["bar", "foo", "total"], level=0).astype(int)

# Remove repeated values
for idx_0 in table.index.get_level_values(0).unique():  # bar, foo, total
    if idx_0 == "total":
        continue
    for idx_1 in table.loc[(idx_0,), :].index[1:]:
        table.loc[(idx_0, idx_1), "total"] = ""

Finally:

print(table)
# Output

              C total
A     B
bar   seven  12    22
      six    10      
foo   five    3    11
      one     1      
      three   2      
      two     5      
total        33    33
Laurent
  • 12,287
  • 7
  • 21
  • 37