I have a dataframe of claim numbers, each of which is a 12-digit number. I am trying to remove reversed claims. A reversed claim shows up as exactly 2 rows with the same claim number: the paid claim and its reversal. There are also cases where a claim was processed, reversed, and then reprocessed; those show up as 3 rows with the same claim number. I want to drop only the claim numbers that appear exactly twice, which would leave me with the paid claims (appearing once) and the reprocessed claims (appearing 3 times). I am having trouble writing the drop_duplicates call in Python. When I run df.drop_duplicates(subset='claim_number', keep=False, inplace=True), every duplicated claim number is dropped, so I lose the reprocessed claims along with the reversed ones. Any help would be appreciated! Here is a sample of the data:
In [2]: df
Out[2]:
A
0 207667742791
1 207667743011
2 207667743361
3 207667743361
4 214063686631
5 214063686631
6 214063686631
Desired Output:
In [2]: df
Out[2]:
A
0 207667742791
1 207667743011
2 214063686631
3 214063686631
4 214063686631
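
One possible approach, sketched here under some assumptions rather than offered as a definitive answer: drop_duplicates only has keep='first', keep='last', and keep=False, so it has no way to treat "appears twice" differently from "appears three times". Counting the rows per claim number with groupby and transform, then filtering on that count, gets around this. The sketch assumes the column is named A as in the sample above (swap in claim_number for the real data) and that a count of exactly 2 always means a paid claim plus its reversal. The names counts and result are just illustrative.

```python
import pandas as pd

# Sample data copied from the question; the column name "A" is taken from the printed output.
df = pd.DataFrame({"A": [207667742791, 207667743011, 207667743361, 207667743361,
                         214063686631, 214063686631, 214063686631]})

# For every row, count how many rows share its claim number.
counts = df.groupby("A")["A"].transform("size")

# Assumption: a claim number that appears exactly twice is a paid + reversed pair.
# Keep every row whose claim number does NOT appear exactly twice.
result = df[counts != 2].reset_index(drop=True)

print(result)
```

If claim numbers can appear 4 or more times (for example, reversed twice), the counts != 2 condition would need to change to match whatever rule applies there.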