I'm concatenating two dataframes column-wise, so that one is placed next to the other. But first I applied a transformation to the initial dataframe:
import pandas as pd
from sklearn.preprocessing import MinMaxScaler
scaler = MinMaxScaler()
real_data = pd.DataFrame(scaler.fit_transform(df[real_columns]), columns=real_columns)
Then I build the dummy columns and concatenate:
categorial_data = pd.get_dummies(df[categor_columns], prefix_sep='__')
train = pd.concat([real_data, categorial_data], axis=1, ignore_index=True)
I don't know why, but the number of rows increased:
print(df.shape, real_data.shape, categorial_data.shape, train.shape)
(1700645, 23) (1700645, 16) (1700645, 130) (1703915, 146)
What happened, and how can I fix the problem?
As you can see, the number of columns in train equals the sum of the columns in real_data and categorial_data.
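
For reference, here is a minimal sketch of one way extra rows can appear. It assumes that df's index is not the default 0..n-1 (for example after filtering or dropping rows), which the post above does not confirm. The toy data and the names scaled, misaligned, real_fixed and aligned are made up for illustration, and a plain NumPy array stands in for the output of scaler.fit_transform:

```python
import pandas as pd

# Hypothetical toy frame whose index has gaps, as if rows had been filtered out.
df = pd.DataFrame(
    {"x": [1.0, 2.0, 3.0], "c": ["a", "b", "a"]},
    index=[0, 2, 5],
)

# Stand-in for scaler.fit_transform: a plain NumPy array, so df's index is lost.
scaled = df[["x"]].to_numpy()
real_data = pd.DataFrame(scaled, columns=["x"])                # index 0, 1, 2
categorial_data = pd.get_dummies(df[["c"]], prefix_sep="__")   # index 0, 2, 5

# With axis=1, concat aligns rows on index labels and takes their union.
misaligned = pd.concat([real_data, categorial_data], axis=1)
print(misaligned.shape)

# Reusing df's index when rebuilding the scaled frame keeps the rows aligned.
real_fixed = pd.DataFrame(scaled, columns=["x"], index=df.index)
aligned = pd.concat([real_fixed, categorial_data], axis=1)
print(aligned.shape)
```

If this is the cause, the concatenated frame has NaN values in the rows whose labels exist in only one of the inputs. Note that ignore_index=True with axis=1 only renumbers the columns; it does not stop the row alignment.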