How to transform a dataframe so that values of a column are the column names and column names are rows but unstacked?

Question

I'm comparing two tables to monitor for changes. I need a resulting table that shows a before and after the change per user per attribute changed.

I used the pandas .compare method and below is an example of my current result, but I can't figure out what the next steps would be. If I'm going about this completely wrong feel free to let me know as well.

I currently have a table that looks like this:

User ID	Name	Favorite	Status
1	Steve	Chocolate	Before
2	John	Chocolate	Before
1	Stevy	Chocolate	After
2	John	Vanilla	After

I need it to look like this:

User ID	Attribute Changed	Before	After
1	Name	Steve	Stevy
2	Name	John	John
1	Favorite	Chocolate	Chocolate
2	Favorite	Chocolate	Vanilla

I can eliminate the lines where no changes were made.

score 2 · Answer 1 · answered Apr 03 '23 at 17:17

2

Maybe via pivot?

import pandas as pd

df = pd.read_clipboard() # Your df here

out = (
    df
    .pivot(index="User ID", columns="Status")
    .swaplevel(axis=1)
    .stack()
    .reset_index(names=["User ID", "Attribute Changed"])
    .rename_axis(None, axis=1)
)

out:

   User ID Attribute Changed      After     Before
0        1          Favorite  Chocolate  Chocolate
1        1              Name      Stevy      Steve
2        2          Favorite    Vanilla  Chocolate
3        2              Name       John       John

answered Apr 03 '23 at 17:17

Chrysophylaxs

5,818
3
10
21

1

Similar, Melting then pivoting: `df.melt(['User ID', 'Status']).pivot(index=["User ID", "variable"], columns="Status", values='value').reset_index().rename_axis(None, axis=1)` – BeRT2me Apr 03 '23 at 17:22
Eliminate no-changes after pivoting: `df[df.After.ne(df.Before)]` – BeRT2me Apr 03 '23 at 17:28
1

Apparently I don't have the reputation to upvote but this worked perfectly. Thank you! – Yisroel Len Apr 03 '23 at 17:35

score 0 · Answer 2 · answered Apr 03 '23 at 17:32

You could unstack and concatenate like

# import pandas as pd
# df = pd.read_html('https://stackoverflow.com/questions/75922021')[0]

unstacked_df = df.set_index(['User ID','Status']).unstack()

kList = {c[0] for c in unstacked_df.columns} ## {'Name','Favorite'}
dfList = [unstacked_df[k] for k in kList] ## [u__df['Name'],u__df['Favorite']]
conc_df = pd.concat(dfList, keys=kList, names=['Attribute Changed', 'User ID'])

df2Cols = ['User ID', 'Attribute Changed', 'After', 'Before']
df2 = conc_df.reset_index().rename_axis(None,axis='columns')[df2Cols]

`unstacked_df`	`conc_df`	`df2`

How to transform a dataframe so that values of a column are the column names and column names are rows but unstacked?

2 Answers2