In Python, I am trying to rebalance a dataset that contains approximately 4000 transactions for a single credit card number, all ordered by time.

There is a large class imbalance between genuine and fraudulent transactions: the data contains only about 15 fraudulent transactions, all of which occurred within a two-day period.

Naturally, I want to rebalance the dataset. However, when I did this using SMOTE, I ended up with approximately 4000 additional synthetic fraudulent transactions, all of which fall within the same two days as the original fraudulent transactions.
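To make the effect concrete, here is a minimal plain-Python sketch of the interpolation step that SMOTE is based on. It is not the code I ran: the data are made up, and the function name `smote_like`, the feature layout (hours since first transaction, amount), and `k=5` are assumptions chosen only for illustration.

```python
import random

def smote_like(minority, n_new, k=5, seed=0):
    """Create n_new points by interpolating between a minority sample
    and one of its k nearest minority neighbours (SMOTE-style sketch)."""
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # k nearest minority neighbours of `base` (squared Euclidean distance),
        # excluding `base` itself
        neighbours = sorted(
            (p for p in minority if p is not base),
            key=lambda p: sum((a - b) ** 2 for a, b in zip(base, p)),
        )[:k]
        other = rng.choice(neighbours)
        gap = rng.random()  # interpolation weight in [0, 1)
        synthetic.append(tuple(a + gap * (b - a) for a, b in zip(base, other)))
    return synthetic

# Made-up data: 15 frauds as (hours since first transaction, amount),
# all inside a 48-hour window starting at hour 1000 (hypothetical values)
data_rng = random.Random(1)
fraud = [(1000.0 + data_rng.uniform(0, 48), data_rng.uniform(50, 500))
         for _ in range(15)]

synthetic = smote_like(fraud, n_new=4000)

t_min = min(t for t, _ in fraud)
t_max = max(t for t, _ in fraud)
inside = all(t_min <= t <= t_max for t, _ in synthetic)
print(len(synthetic), inside)
```

Because every synthetic point lies on a line segment between two real fraudulent transactions, its time value cannot fall outside the range spanned by the original frauds, which matches the clustering I am seeing.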

Is there any way to generate synthetic fraudulent transactions whose timing is more randomised than this?

math189925
