Before concatenation, both DataFrames have 7841 rows, but after concatenation the number of rows increases to 9005.

import pandas as pd
from sklearn.preprocessing import PowerTransformer

# Numeric columns to be power-transformed
trans_features = ['Customer_Age',
                  'Dependent_count',
                  'Contacts_Count_12_mon',
                  'Months_Inactive_12_mon',
                  'Credit_Limit',
                  'Total_Revolving_Bal',
                  'Total_Amt_Chng_Q4_Q1',
                  'Total_Trans_Amt',
                  'Total_Ct_Chng_Q4_Q1',
                  'Avg_Utilization_Ratio']

df_trans_feat = df_bank[trans_features]

pt = PowerTransformer()   # defaults to method='yeo-johnson'
transformed = pt.fit_transform(df_trans_feat)

df_transformed = pd.DataFrame(transformed, columns=df_trans_feat.columns)
print("df_transformed:", df_transformed.shape)
df_bank.drop(columns=trans_features, inplace=True)
print("df_bank:", df_bank.shape)

df_bank_comb = pd.concat([df_bank, df_transformed], axis=1)
print("df_bank_comb:", df_bank_comb.shape)

Output:
df_transformed: (7841, 10)
df_bank: (7841, 7)
df_bank_comb: (9005, 17)

The increase in the number of rows is baffling. My intention is to combine the two DataFrames horizontally, column by column, so the result should still have 7841 rows. Is there a problem with my concat statement?
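One possible explanation, offered as an assumption because the index of `df_bank` is not shown: `pd.concat(..., axis=1)` lines rows up by index label, not by position. `pd.DataFrame(transformed, columns=...)` is built from a bare NumPy array and so gets a fresh index 0..7840, while `df_bank` may still carry labels left over from earlier row filtering. Labels found in only one frame then become extra rows filled with NaN. The sketch below uses a small made-up frame and `np.log1p` as a stand-in for `PowerTransformer().fit_transform`, and reuses the original index when rebuilding the transformed frame.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for df_bank after some rows were filtered out:
# the index labels are 0, 2, 3, 5 rather than 0..3.
df_bank = pd.DataFrame(
    {"Credit_Limit": [1000.0, 2500.0, 4000.0, 800.0],
     "Gender": ["M", "F", "F", "M"]},
    index=[0, 2, 3, 5],
)
trans_features = ["Credit_Limit"]
df_trans_feat = df_bank[trans_features]

# np.log1p stands in for PowerTransformer().fit_transform here; both
# return a plain NumPy array that carries no index information.
transformed = np.log1p(df_trans_feat.to_numpy())

# Reuse the original index so that concat can line the rows up by label.
df_transformed = pd.DataFrame(
    transformed, columns=df_trans_feat.columns, index=df_trans_feat.index
)

df_rest = df_bank.drop(columns=trans_features)
df_bank_comb = pd.concat([df_rest, df_transformed], axis=1)
print("df_bank_comb:", df_bank_comb.shape)
```

If the original labels are not needed, calling `reset_index(drop=True)` on `df_bank` before the transformation should have the same effect, since both frames would then share the default index.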

  • please provide a reproducible example if the duplicate does not help you to solve your problem – mozway Dec 06 '22 at 13:12
  • Solution provided in the tagged duplicate thread did not solve the problem. It was mentioned in the thread to use the following: pd.concat([df1,df2], axis=1). Having used the same in my code, the problem still persists! – Penny Cooper Dec 06 '22 at 13:24
  • please provide a reproducible sample of `df_bank`, `df_transformed` that recapitulates the problem – mozway Dec 06 '22 at 13:30
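A reproducible sample along the lines requested might look like the sketch below. The data is synthetic and the names `df_left` and `df_right` are hypothetical; it only illustrates one way a row count can grow under `pd.concat(..., axis=1)`, namely when the two indexes differ, and is not taken from the asker's dataset.

```python
import numpy as np
import pandas as pd

# Made-up frame whose index has gaps, as happens after dropping rows.
df_left = pd.DataFrame({"a": [10, 20, 30]}, index=[0, 2, 4])

# Built from a bare array, so it gets the default index 0, 1, 2.
df_right = pd.DataFrame(np.array([[1.0], [2.0], [3.0]]), columns=["b"])

# Quick diagnostic: do the two frames share the same index?
same_index = df_left.index.equals(df_right.index)
print(same_index)        # False

# Outer alignment on the union of labels {0, 1, 2, 4} gives 4 rows.
combined = pd.concat([df_left, df_right], axis=1)
print(combined.shape)    # (4, 2)

# Dropping both indexes makes the concat purely positional.
positional = pd.concat(
    [df_left.reset_index(drop=True), df_right.reset_index(drop=True)],
    axis=1,
)
print(positional.shape)  # (3, 2)
```

Running `df_bank.index.equals(df_transformed.index)` on the real frames would show whether this is what is happening in the question.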
