Questions tagged [duplicates]

The "duplicates" tag concerns detecting and/or dealing with multiple instances of items in collections.

A duplicate is any re-occurrence of an item in a collection. This can be as simple as two identical strings in a list of strings, or multiple complex objects which are treated as the same object when compared to each other.

This tag may pertain to questions about preventing, detecting, removing, or otherwise dealing with unwanted duplicates, or adapting to safely allow duplicates.

15777 questions
554
votes
5 answers

How do I (or can I) SELECT DISTINCT on multiple columns?

I need to retrieve all rows from a table where 2 columns combined are all different. So I want all the sales that do not have any other sales that happened on the same day for the same price. The sales that are unique based on day and price will get…
sheats
  • 33,062
  • 15
  • 45
  • 44
506
votes
13 answers

C# LINQ find duplicates in List

Using LINQ, from a List, how can I retrieve a list that contains entries repeated more than once and their values?
Mirko Arcese
  • 5,133
  • 2
  • 13
  • 11
475
votes
2 answers

Delete all Duplicate Rows except for One in MySQL?

How would I delete all duplicate data from a MySQL Table? For example, with the following data: SELECT * FROM names; +----+--------+ | id | name | +----+--------+ | 1 | google | | 2 | yahoo | | 3 | msn | | 4 | google | | 5 | google | | 6…
Highway of Life
  • 22,803
  • 16
  • 52
  • 80
449
votes
7 answers

Remove pandas rows with duplicate indices

How to remove rows with duplicate index values? In the weather DataFrame below, sometimes a scientist goes back and corrects observations -- not by editing the erroneous rows, but by appending a duplicate row to the end of a file. I'm reading some…
Paul H
  • 65,268
  • 20
  • 159
  • 136
442
votes
28 answers

Remove duplicate rows in MySQL

I have a table with the following fields: id (Unique) url (Unique) title company site_id Now, I need to remove rows having same title, company and site_id. One way to do it will be using the following SQL along with a script (PHP): SELECT title,…
Chetan
  • 4,885
  • 4
  • 22
  • 15
378
votes
8 answers

Remove duplicate elements from array in Ruby

I have a Ruby array which contains duplicate elements. array = [1,2,2,1,4,4,5,6,7,8,5,6] How can I remove all the duplicate elements from this array while retaining all unique elements without using for-loops and iteration?
Mithun Sasidharan
  • 20,572
  • 10
  • 34
  • 52
365
votes
26 answers

What's the most efficient way to erase duplicates and sort a vector?

I need to take a C++ vector with potentially a lot of elements, erase duplicates, and sort it. I currently have the below code, but it doesn't work. vec.erase( std::unique(vec.begin(), vec.end()), vec.end()); std::sort(vec.begin(),…
Kyle Ryan
  • 4,271
  • 5
  • 22
  • 20
347
votes
9 answers

How to find duplicate records in PostgreSQL

I have a PostgreSQL database table called "user_links" which currently allows the following duplicate fields: year, user_id, sid, cid The unique constraint is currently the first field called "id", however I am now looking to add a constraint to…
John
  • 6,417
  • 9
  • 27
  • 32
310
votes
15 answers

Remove duplicates by columns A, keeping the row with the highest value in column B

I have a dataframe with repeat values in column A. I want to drop duplicates, keeping the row with the highest value in column B. So this: A B 1 10 1 20 2 30 2 40 3 10 Should turn into this: A B 1 20 2 40 3 10 I'm guessing there's probably an…
Abe
  • 22,738
  • 26
  • 82
  • 111
283
votes
13 answers

How do I get a list of all the duplicate items using pandas in python?

I have a list of items that likely has some export issues. I would like to get a list of the duplicate items so I can manually compare them. When I try to use pandas duplicated method, it only returns the first duplicate. Is there a a way to get…
BigHandsome
  • 4,843
  • 5
  • 23
  • 30
255
votes
15 answers

How do I check if there are duplicates in a flat list?

For example, given the list ['one', 'two', 'one'], the algorithm should return True, whereas given ['one', 'two', 'three'] it should return False.
teggy
  • 5,995
  • 9
  • 38
  • 41
252
votes
8 answers

Drop all duplicate rows across multiple columns in Python Pandas

The pandas drop_duplicates function is great for "uniquifying" a dataframe. I would like to drop all rows which are duplicates across a subset of columns. Is this possible? A B C 0 foo 0 A 1 foo 1 A 2 foo 1 B 3 bar 1 A As an…
Jamie Bull
  • 12,889
  • 15
  • 77
  • 116
251
votes
5 answers

MySQL ON DUPLICATE KEY UPDATE for multiple rows insert in single query

I have a SQL query where I want to insert multiple rows in single query. so I used something like: $sql = "INSERT INTO beautiful (name, age) VALUES ('Helen', 24), ('Katrina', 21), ('Samia', 22), ('Hui Ling', 25), ('Yumie',…
Prashant
  • 5,331
  • 10
  • 44
  • 59
245
votes
29 answers

How do I remove duplicates from a C# array?

I have been working with a string[] array in C# that gets returned from a function call. I could possibly cast to a Generic collection, but I was wondering if there was a better way to do it, possibly by using a temp array. What is the best way to…
lomaxx
  • 113,627
  • 57
  • 144
  • 179
245
votes
18 answers

Finding duplicate rows in SQL Server

I have a SQL Server database of organizations, and there are many duplicate rows. I want to run a select statement to grab all of these and the amount of dupes, but also return the ids that are associated with each organization. A statement like: …
xtine
  • 2,599
  • 2
  • 18
  • 15