df[df['CustID'].duplicated(keep=False)]
This finds the rows in the data frame whose CustID value appears more than once in the CustID column. The keep=False argument tells duplicated to mark every occurrence of a repeated value as True (by default the first occurrence is left unmarked, and with keep='last' the last one is):
  CustID Purchase        Time
0      A    Item1  01/01/2011
3      A    Item2  03/01/2011
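For anyone who wants to try this, here is a minimal self-contained sketch. The original dataframe isn't shown in full, so rows 1 and 2 below are made up for illustration; only rows 0 and 3 are taken from the output above. It also contrasts keep=False with the default keep='first'.

```python
import pandas as pd

# Hypothetical data: rows 0 and 3 match the output shown above,
# rows 1 and 2 are invented so that there are some non-duplicated customers.
df = pd.DataFrame({
    "CustID": ["A", "B", "C", "A"],
    "Purchase": ["Item1", "Item1", "Item3", "Item2"],
    "Time": ["01/01/2011", "02/01/2011", "02/01/2011", "03/01/2011"],
})

# keep=False flags every row whose CustID occurs more than once.
mask_all = df["CustID"].duplicated(keep=False)

# The default (keep='first') leaves the first occurrence unflagged.
mask_first = df["CustID"].duplicated()

# Boolean indexing keeps only the flagged rows.
dupes = df[mask_all]
print(dupes)
```

With keep=False both of customer A's rows are selected; with the default only the second one (index 3) would be.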
EDIT
Looking at the docs for duplicated, it looks like you can also do:
df[df.duplicated('CustID', keep=False)]
Though this seems to be about 100 µs slower than the original (458 µs for the original vs. 545 µs for this version, based on the example dataframe).
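If you want to check this yourself, something like the following should do. It is only a rough sketch: the dataframe is a made-up stand-in for the example one, and the timings will vary with machine, pandas version and data size, so don't expect to reproduce the numbers above exactly.

```python
import timeit

import pandas as pd

# Hypothetical stand-in for the example dataframe.
df = pd.DataFrame({
    "CustID": ["A", "B", "C", "A"],
    "Purchase": ["Item1", "Item1", "Item3", "Item2"],
    "Time": ["01/01/2011", "02/01/2011", "02/01/2011", "03/01/2011"],
})

# Series.duplicated on the column vs. DataFrame.duplicated with a subset.
by_series = df[df["CustID"].duplicated(keep=False)]
by_frame = df[df.duplicated("CustID", keep=False)]

# Both selections should contain exactly the same rows.
same = by_series.equals(by_frame)
print(same)

# Rough per-call timing, averaged over n runs.
n = 1000
t_series = timeit.timeit(
    lambda: df[df["CustID"].duplicated(keep=False)], number=n) / n
t_frame = timeit.timeit(
    lambda: df[df.duplicated("CustID", keep=False)], number=n) / n
print(f"Series.duplicated:    {t_series * 1e6:.0f} µs per call")
print(f"DataFrame.duplicated: {t_frame * 1e6:.0f} µs per call")
```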