Reconstruct a dataframe from a contingency table in Python

Question

I would like to reconstruct a dataframe from a contingency table stored as a dataframe. For example, from ctab I would like to build df1 or df2. Is there a command to do that, or do I need a loop?

import pandas as pd
ctab = pd.DataFrame([[1,2], [2, 1]], columns=["A", "B"], index=["A", "B"])
print(ctab)
df1 = pd.DataFrame([["A","A", 1], ["A","B", 2], ["B","A", 2], ["B","B", 1]], columns=["col", "index", "freq"])
print(df1)
df2 = pd.DataFrame([["A","A"], ["A","B"], ["A","B"], ["B","A"], ["B","A"], ["B","B"]], columns=["col", "index"])
print(df2)

mozway · Accepted Answer · 2023-04-20T06:41:33.813

You can use rename_axis, stack, and reset_index:

out = ctab.rename_axis(index='index', columns='col').stack().reset_index(name='freq')

Output:

  index col  freq
0     A   A     1
1     A   B     2
2     B   A     2
3     B   B     1
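
To see what each step in the chain contributes, here is a minimal sketch that splits it into intermediate variables. The names named, stacked, and long_df are only illustrative, and the final column reorder is an optional extra that assumes you want the same column order as df1 in the question.

```python
import pandas as pd

ctab = pd.DataFrame([[1, 2], [2, 1]], columns=["A", "B"], index=["A", "B"])

# Name both axes so the labels become column names after stacking.
named = ctab.rename_axis(index="index", columns="col")

# stack() moves the column labels into an inner index level,
# giving a Series keyed by (index, col) pairs.
stacked = named.stack()

# reset_index(name=...) turns both index levels into columns
# and names the value column.
long_df = stacked.reset_index(name="freq")

# Optional (assumption): reorder the columns to match df1 from the question.
long_df = long_df[["col", "index", "freq"]]
print(long_df)
```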

For the second one, replicate each row freq times with Index.repeat (pop removes the freq column while returning its values):

out = ctab.rename_axis(index='index', columns='col').stack().reset_index(name='freq')

out = out.loc[out.index.repeat(out.pop('freq'))]

Output:

  index col
0     A   A
1     A   B
1     A   B
2     B   A
2     B   A
3     B   B
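
As a sanity check, the expanded rows can be cross-tabulated again, which should give back the original counts. This is a sketch only: the reset_index(drop=True) step assumes you prefer a fresh 0..n-1 index over the repeated labels shown above, and the names expanded and rebuilt are illustrative.

```python
import pandas as pd

ctab = pd.DataFrame([[1, 2], [2, 1]], columns=["A", "B"], index=["A", "B"])

long_df = (
    ctab.rename_axis(index="index", columns="col")
        .stack()
        .reset_index(name="freq")
)

# Repeat each row label `freq` times; pop() drops the freq column as it goes.
expanded = long_df.loc[long_df.index.repeat(long_df.pop("freq"))]

# Assumption: a clean 0..n-1 index is wanted instead of the repeated labels.
expanded = expanded.reset_index(drop=True)

# Round trip: counting the (index, col) pairs rebuilds the contingency table.
rebuilt = pd.crosstab(expanded["index"], expanded["col"])
print(rebuilt)
```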
