Python - duplicated lines

Question

I am new to Python. I would like to find the duplicated lines (rows) in a data frame. To illustrate, I have the following data frame:

type(data)
pandas.core.frame.DataFrame

data.head()

   User  Hour  Min  Day  Month  Year  Latitude    Longitude
0     0     1   48   17     10  2010  39.75000  -105.000000
1     0     6    2   16     10  2010  39.90625  -105.062500
2     0     3   48   16     10  2010  39.90625  -105.062500
3     0    18   25   14     10  2010  39.75000  -105.000000

I would like to find the duplicated lines in this data frame and return the 'User' value that corresponds to each of them.

Thanks a lot,

There are no duplicated lines in this data. What constitutes duplicate in this case? Did you check the docs? http://pandas.pydata.org/pandas-docs/version/0.17.1/generated/pandas.DataFrame.duplicated.html — Woody Pride, Dec 17 '15 at 21:05
Did you check other SO questions such as this one that provides a very clear answer: http://stackoverflow.com/questions/26244309/how-to-analyze-all-duplicate-entries-in-this-pandas-dataframe — Woody Pride, Dec 17 '15 at 21:05
I tried Counter from collections to begin with, so that it could give me the **number** of duplicated rows, but Counter only works on a single-column data frame. @Chris — Mitch, Dec 17 '15 at 21:05

score 0 · Answer 1 · answered Dec 17 '15 at 21:18

Is this what you are looking for?

user = data[data.duplicated()]['User']
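
Note that `duplicated()` with its default `keep='first'` flags only the second and later copies of a row, and it compares all columns unless `subset=` is given. Here is a minimal sketch of the variants, assuming the column names from the question. The fifth row is a made-up repeat of the first row, added only because the posted sample contains no exact duplicates, and names such as `all_copies` and `same_place` are just illustrative.

```python
import pandas as pd

# Rows 0-3 come from the question; row 4 is a hypothetical repeat of row 0,
# added for illustration because the posted sample has no exact duplicates.
data = pd.DataFrame(
    {
        "User": [0, 0, 0, 0, 0],
        "Hour": [1, 6, 3, 18, 1],
        "Min": [48, 2, 48, 25, 48],
        "Day": [17, 16, 16, 14, 17],
        "Month": [10, 10, 10, 10, 10],
        "Year": [2010, 2010, 2010, 2010, 2010],
        "Latitude": [39.75, 39.90625, 39.90625, 39.75, 39.75],
        "Longitude": [-105.0, -105.0625, -105.0625, -105.0, -105.0],
    }
)

# Default keep='first': only the second and later copies are flagged.
later_copies = data[data.duplicated()]

# keep=False flags every row that has at least one identical twin.
all_copies = data[data.duplicated(keep=False)]

# subset= restricts the comparison to chosen columns, here the coordinates.
same_place = data[data.duplicated(subset=["Latitude", "Longitude"], keep=False)]

# Count how often each full row occurs, keeping only rows seen more than once.
counts = data.groupby(list(data.columns)).size()
repeated_counts = counts[counts > 1]

# Distinct users that own at least one duplicated row.
users = all_copies["User"].unique()
```

The `groupby(...).size()` step is one way to get the number of occurrences per row across several columns, which is what Counter was being used for in the comments.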

answered Dec 17 '15 at 21:18 by screenpaver
