I am running an ML experiment in Python and I am stuck with data that contain overlaps. I have a DataFrame with multiple columns, and each row is to a large extent similar to the rows that follow it.
Are there pandas functions that can split my DataFrame into two sets so that the overall overlap between the two sets is as small as possible?
Unfortunately I cannot share the dataset, but if you can point me to relevant functions, that will be enough for me to continue searching and reading.
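To make the question more concrete, here is a minimal sketch with made-up data standing in for the real one. Everything in it is an assumption for illustration only: the sliding-window construction, the column names `t0`, `t1`, `t2`, the 70/30 ratio, and the helper `shared_values`, which is just one possible way to measure overlap. It compares a random split with a contiguous split that drops a few rows at the boundary.

```python
import pandas as pd

# Toy stand-in for my data (hypothetical): each row is a sliding window,
# so row i shares two of its three values with row i + 1.
window = 3
series = list(range(20))
df = pd.DataFrame(
    [series[i:i + window] for i in range(len(series) - window + 1)],
    columns=["t0", "t1", "t2"],
)


def shared_values(a: pd.DataFrame, b: pd.DataFrame) -> int:
    """Count distinct values present in both frames (one possible overlap measure)."""
    return len(set(a.to_numpy().ravel()) & set(b.to_numpy().ravel()))


# Random split: neighbouring, similar rows end up on both sides.
random_train = df.sample(frac=0.7, random_state=0)
random_test = df.drop(random_train.index)

# Contiguous split with a gap: cut by position and discard the
# window - 1 rows right after the cut, which would straddle both sets.
cut = int(len(df) * 0.7)
block_train = df.iloc[:cut]
block_test = df.iloc[cut + window - 1:]

print("random split overlap:", shared_values(random_train, random_test))
print("block split overlap: ", shared_values(block_train, block_test))
```

In this toy case the contiguous split with a gap has no shared values, while the random split does. My real data is not this regular, so I am looking for something that does this kind of split more generally.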
Thank you in advance for your reply.
Regards,
Alex