Dividing Dataframes with Different Dimensions

Question

I prefer to use matrix multiplication for coding, because it's so much more efficient than iterating, but curious on how to do this if the dimensions are different.

I have two different dataframes

A:

Orig_vintage
Q12018	185
Q22018.	200

and B:

default_month	1	2	3
orig_vintage
Q12018	0	25	35
Q22018	0	15	45
Q32018	0	35	65

and I'm trying to divide A through columns of B, so the B dataframe becomes (note I've rounded random percentages):

default_month	1	2	3
orig_vintage
Q12018	0	.03	.04
Q22018	0	.04	.05
Q32018	0	.06	.07

But bottom line want to divide the monthly defaults by the total origination figure to get to a monthly default %.

score 0 · Answer 1 · answered Dec 29 '21 at 10:11

first step is get data side by side with a right join()
then divide all columns by required value Divide multiple columns by another column in pandas
required value as I understand is sum, if join did not give a value.

import pandas as pd
import io

df1 = pd.read_csv(
    io.StringIO("""Orig_vintage,Unnamed: 1\nQ12018,185\nQ22018,200\n"""), sep=","
)

df2 = pd.read_csv(
    io.StringIO(
        """default_month,1,2,3\nQ12018,0.0,25.0,35.0\nQ22018,0.0,15.0,45.0\nQ32018,0.0,35.0,65.0\n"""
    ),
    sep=",",
)

df1.set_index("Orig_vintage").join(df2.set_index("default_month"), how="right").pipe(
    lambda d: d.div(d["Unnamed: 1"].fillna(d["Unnamed: 1"].sum()), axis=0)
)

default_month	Unnamed: 1	2	3
Q12018	1	0.135135	0.189189
Q22018	1	0.075	0.225
Q32018	nan	0.0909091	0.168831

Dividing Dataframes with Different Dimensions

1 Answers1