Let's say we have the following dataframe:
| | iiD | Suppressant | Suppressant_New |
|---:|-------:|:--------------------------|-----------------|
| 0 | 0 | EI402 | EI503 |
| 1 | 0 | EI503 | EG-CA 422 |
| 2 | 0 | EG-CA 422 | EG-CAX |
| 6 | 0 | EG-CAX | None |
| 7 | 1 | EH333 | ET777 |
| 8 | 1 | ET777 | EH422 |
| 8 | 1 | EI503 | EG-CA 422 |
| 9 | 1 | EG-CA 422 | None |
Now I want to count the duplicate rows (rows with the same `Suppressant` and `Suppressant_New` values) and add the count as a separate column, preserving `iiD`.
So the result should look like this:
| | iiD | Suppressant | Suppressant_New | count |
|---:|-------:|:--------------------------|-----------------|------:|
| 0 | 0 | EI402 | EI503 | 1 |
| 1 | 0 | EI503 | EG-CA 422 | 2 |
| 2 | 0 | EG-CA 422 | EG-CAX | 1 |
| 6 | 0 | EG-CAX | None | 1 |
| 7 | 1 | EH333 | ET777 | 1 |
| 8 | 1 | ET777 | EH422 | 1 |
| 8 | 1 | EI503 | EG-CA 422 | 2 |
| 9 | 1 | EG-CA 422 | None | 1 |
What is an efficient way to do this with Pandas?
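
For reference, here is a minimal sketch of one possible approach. It assumes that "duplicate" means the same (`Suppressant`, `Suppressant_New`) pair anywhere in the frame, regardless of `iiD`, and that the `None` entries are real missing values rather than the string "None". The `"<missing>"` placeholder and the `key_cols` / `keys` names are only illustrative. It groups on the two columns and broadcasts each group's size back to the original rows with `transform("size")`, so no rows are dropped or reordered:

```python
import pandas as pd

# Rebuild the example frame from the question (index labels included).
df = pd.DataFrame(
    {
        "iiD": [0, 0, 0, 0, 1, 1, 1, 1],
        "Suppressant": ["EI402", "EI503", "EG-CA 422", "EG-CAX",
                        "EH333", "ET777", "EI503", "EG-CA 422"],
        "Suppressant_New": ["EI503", "EG-CA 422", "EG-CAX", None,
                            "ET777", "EH422", "EG-CA 422", None],
    },
    index=[0, 1, 2, 6, 7, 8, 8, 9],
)

key_cols = ["Suppressant", "Suppressant_New"]

# groupby skips missing keys by default, so replace them with a
# placeholder (an assumption) in a copy used only for grouping.
keys = df[key_cols].fillna("<missing>")

# transform("size") returns one value per input row: the size of that row's group.
# to_numpy() assigns by position, which sidesteps the duplicated index label 8.
df["count"] = keys.groupby(key_cols)["Suppressant"].transform("size").to_numpy()

print(df)
```

Because the count is computed with `transform`, the original rows, their order, and the `iiD` column stay as they were; only the new `count` column is added.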