I'm looking for the rationale behind the method the pandas profiling tool uses to identify duplicate rows (in a dataframe with multiple columns). I couldn't find it in the Pandas Profiling documentation.
1 Answer
See model/summary, lines 571-575, in the pandas profiling source.
In other cases, the count can be simplified to sum(df.duplicated()).
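
As a rough illustration of the simplified form, here is a minimal sketch using only pandas' own DataFrame.duplicated. It is not the pandas profiling implementation, and the sample dataframe and the variable names (df, n_duplicates, n_in_duplicate_groups) are made up for the example. By default, duplicated() compares all columns and flags every repeat of a row except its first occurrence.

```python
import pandas as pd

# Hypothetical sample data: rows 0, 1 and 3 are identical across all columns.
df = pd.DataFrame(
    {
        "a": [1, 1, 2, 1],
        "b": ["x", "x", "y", "x"],
        "c": [0.5, 0.5, 0.7, 0.5],
    }
)

# duplicated() returns a boolean Series; True marks a row that repeats
# an earlier row (keep="first" is the default, so the first copy is False).
mask = df.duplicated()

# Summing the booleans counts the duplicate rows, same idea as sum(df.duplicated()).
n_duplicates = int(mask.sum())

# keep=False flags every member of a duplicated group, including the first copy.
n_in_duplicate_groups = int(df.duplicated(keep=False).sum())

print(n_duplicates, n_in_duplicate_groups)
```

Passing subset=[...] to duplicated() would restrict the comparison to selected columns; without it, a row counts as a duplicate only when every column matches.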

loopy