Questions tagged [duplicate-data]
354 questions
2
votes
1 answer
Display a table with just the second duplicate rows removed yet keep the first row
So, I have a table with 3 columns, of which the first column consists of IDs and the last column consists of dates. What I need is to sort the table by date and remove any duplicate IDs with a later date (and keep the ID with the earliest…

Nams
- 47
- 1
- 9
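A minimal sketch of the "keep the earliest row per ID" idea from the question above, written in Python rather than in whatever tool the asker is using. The three-column tuple layout `(id, value, date)` and the function name `keep_earliest` are assumptions made for illustration only.

```python
from datetime import date


def keep_earliest(rows):
    """Return one row per ID (the row with the earliest date), sorted by date.

    Each row is assumed to be a tuple (id, value, date).
    """
    earliest = {}
    for row in rows:
        row_id, _, row_date = row
        # Strict "<" keeps the first-seen row when two dates tie.
        if row_id not in earliest or row_date < earliest[row_id][2]:
            earliest[row_id] = row
    return sorted(earliest.values(), key=lambda r: r[2])


rows = [
    (1, "a", date(2012, 3, 1)),
    (2, "b", date(2012, 1, 15)),
    (1, "c", date(2012, 2, 1)),
]
result = keep_earliest(rows)
```

Here `result` holds two rows: ID 2 first, then the earlier of the two ID 1 rows.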
2
votes
3 answers
Data deduplication algorithms
I'd like to find data deduplication algorithms, mostly to find duplicate files. Looks like the first step is to identify the files with the same timestamps, sizes and file names. I can do an MD5 checksum on those files and compare. In addition to…

Roman Kagan
- 10,440
- 26
- 86
- 126
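One possible way to sketch the approach described in the question above: group files by size first, then compute an MD5 checksum only for files that share a size. Grouping by size alone (ignoring timestamps and file names) is an assumption of this sketch, and the function names `file_md5` and `find_duplicate_files` are hypothetical.

```python
import hashlib
import os
from collections import defaultdict


def file_md5(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, read in chunks to bound memory use."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def find_duplicate_files(root):
    """Return lists of paths under root that share both size and MD5 digest."""
    by_size = defaultdict(list)
    for dirpath, _, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            # Skip symlinks and anything that is not a regular file.
            if os.path.isfile(path) and not os.path.islink(path):
                by_size[os.path.getsize(path)].append(path)

    by_hash = defaultdict(list)
    for size, paths in by_size.items():
        if len(paths) < 2:
            continue  # A unique size cannot have a duplicate, so skip hashing.
        for path in paths:
            by_hash[(size, file_md5(path))].append(path)

    return [sorted(paths) for paths in by_hash.values() if len(paths) > 1]
```

Hashing only within same-size groups avoids reading most files at all. Since MD5 collisions are possible, a byte-by-byte comparison of each reported group may be worth adding before deleting anything.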
2
votes
3 answers
Detecting a duplicate customer
I have a bunch of customer data that is normalized into multiple tables. I want to decide the best criteria for making a best guess that a customer might be the same. There needs to be a balance between minimizing the number of duplicates but also…

Christopher Martin
- 927
- 1
- 7
- 9
1
vote
4 answers
Using Perl to clean up a filesystem with one or more duplicates
I have two disks, one an ad-hoc backup disk, which is a mess with duplicates everywhere and another disk in my laptop which is an equal mess. I need to back up unique files and delete duplicates. So, I need to do the following:
Find all non-zero…
zoot
1
vote
1 answer
Excel Format duplicate values - change the text in the cell
It is very easy to format the cells that have duplicated values (like setting a specific background on them or some other style) using "Conditional formatting", but how can I change their text?
For example:
A1 2332
A2 2333
A3 2334
A4…

gotqn
- 42,737
- 46
- 157
- 243
1
vote
4 answers
How to copy list of dictionaries
I have two dictionaries. When I change a value in dictionary 1, the same change appears in dictionary 2. How do I change a value only in dictionary 1, not in dictionary 2 as well?
List> ld1 = new List

Radicz
- 145
- 1
- 9
1
vote
0 answers
Fast, duplicate requests create duplicate records that shouldn't be valid
There are validations in place that should not allow a duplicate record in a particular table to be created based on two different ID values (one is a user ID) and a state (some states are allowed to have duplicates, but not others).
When we get two…

Chris Butler
- 733
- 6
- 23
1
vote
1 answer
Duplicated records for a column
I'm trying to get duplicated values in col1 for a certain col2 value.
Suppose that I have that table:
+----+------------+----------+
| id | col1 | col2 |
+----+------------+----------+
| 1 | 5 | 2 |
| 2 | 5 | 1 …

tuze
- 1,978
- 2
- 15
- 19
1
vote
5 answers
Efficient checking of possible duplicate entities
I have a requirement to produce a list of possible duplicates before a user saves an entity to the database and warn them of the possible duplicates.
There are 7 criteria on which we should check for duplicates and if at least 3 match we should…

JonC
- 809
- 8
- 18
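The "at least 3 of 7 criteria match" rule in the question above could be sketched roughly as follows. The question does not list its criteria, so the field names in the usage example are purely hypothetical, and treating records as dictionaries with exact-equality matching is an assumption of this sketch.

```python
def count_matches(candidate, existing, fields):
    """Count the fields on which two records agree; missing values never match."""
    return sum(
        1
        for field in fields
        if candidate.get(field) is not None
        and candidate.get(field) == existing.get(field)
    )


def possible_duplicates(candidate, existing_records, fields, threshold=3):
    """Return existing records that match the candidate on >= threshold fields."""
    return [
        record
        for record in existing_records
        if count_matches(candidate, record, fields) >= threshold
    ]


# Hypothetical criteria; the original question does not name its seven fields.
FIELDS = ["first_name", "last_name", "email", "phone", "postcode", "dob", "company"]

existing = [
    {"id": 1, "first_name": "Ann", "last_name": "Lee", "email": "ann@example.com",
     "postcode": "AB1"},
    {"id": 2, "first_name": "Ann", "last_name": "Ray", "email": "ray@example.com",
     "postcode": "ZZ9"},
]
candidate = {"first_name": "Ann", "last_name": "Lee", "email": "ann@example.com",
             "postcode": "XY2"}
warnings = possible_duplicates(candidate, existing, FIELDS)
```

This compares the candidate against every stored record, which is fine for small tables; for larger ones the same counting could be pushed into the database query.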
1
vote
2 answers
Textbox in modal popup inserts duplicate entries
My textbox is supposed to enter one value and enters about 8 of the same thing. Anyone know why?
Feature

Jamie
- 1,579
- 8
- 34
- 74
1
vote
4 answers
Simple duplicate word checker in PHP
Is there any way of detecting a duplicate form submit with PHP?
I have a textfield which is used for typing a one-word phrase into it (eg. "Google"). When the user presses the submit button, the form containing that textfield gets submitted.
After…

Akos
- 1,997
- 6
- 27
- 40
1
vote
3 answers
Identify duplicate records and update them with the ID of first occurrence
I have a table like this.
ID Name Source ID
1 Orange 0
2 Pear 0
3 Apple 0
4 Orange 0
5 Apple 0
6 Banana 0
7 Orange 0
What I want to do is:
For the records with FIRST occurrence of…

Jag
- 89
- 3
- 8
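A small sketch of what the title of the question above asks for, using its sample table. The excerpt is cut off, so the exact rule is an assumption here: the first occurrence of each name keeps its Source ID, and every later occurrence gets the ID of that first occurrence. The function name `link_to_first_occurrence` is hypothetical.

```python
def link_to_first_occurrence(records):
    """Set Source ID of repeated names to the ID of the name's first occurrence.

    Each record is a tuple (id, name, source_id); input order defines "first".
    """
    first_id = {}
    result = []
    for rec_id, name, source_id in records:
        if name in first_id:
            # Later duplicate: point it at the first record with this name.
            result.append((rec_id, name, first_id[name]))
        else:
            # First occurrence: remember its ID and leave the row unchanged.
            first_id[name] = rec_id
            result.append((rec_id, name, source_id))
    return result


table = [
    (1, "Orange", 0),
    (2, "Pear", 0),
    (3, "Apple", 0),
    (4, "Orange", 0),
    (5, "Apple", 0),
    (6, "Banana", 0),
    (7, "Orange", 0),
]
updated = link_to_first_occurrence(table)
```

With this rule, rows 4 and 7 end up with Source ID 1 and row 5 with Source ID 3; all other rows are unchanged.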
1
vote
0 answers
ARRGH! Duplicate results from a query, but only sometimes
UPDATE>>>> If I run this on localhost, there is no duplication of data, but on the site there is. Does that help to give people an idea of what may be going on?
I have an application built by someone else that is returning results from a…

Andrew
- 63
- 6
1
vote
0 answers
Handling duplicate keys in quicksort
A naïve quicksort will take O(n^2) time to sort an array containing no unique keys, because all keys will be partitioned either before or after the pivot value. There are ways to handle duplicate keys (like one described in Quicksort is Optimal).…

Derek
- 101
- 1
- 7
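One common way to handle the duplicate-key case raised in the question above is three-way ("Dutch national flag") partitioning, sketched below. This is offered as an illustration of that general technique, not necessarily the exact variant the question's reference describes; the function name `quicksort_3way` is an assumption.

```python
import random


def quicksort_3way(items, lo=0, hi=None):
    """Sort items in place using quicksort with three-way partitioning."""
    if hi is None:
        hi = len(items) - 1
    if lo >= hi:
        return
    pivot = items[random.randint(lo, hi)]
    # Invariant: items[lo:lt] < pivot, items[lt:i] == pivot, items[gt+1:hi+1] > pivot.
    lt, i, gt = lo, lo, hi
    while i <= gt:
        if items[i] < pivot:
            items[lt], items[i] = items[i], items[lt]
            lt += 1
            i += 1
        elif items[i] > pivot:
            items[i], items[gt] = items[gt], items[i]
            gt -= 1  # Do not advance i: the swapped-in element is unexamined.
        else:
            i += 1
    # Keys equal to the pivot are already in place and are never recursed on.
    quicksort_3way(items, lo, lt - 1)
    quicksort_3way(items, gt + 1, hi)


data = [3, 1, 3, 3, 2, 1, 3, 2, 2, 1]
quicksort_3way(data)
```

Because the block of keys equal to the pivot is excluded from both recursive calls, an array whose keys are all identical is finished after a single partitioning pass.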
1
vote
0 answers
Identify duplicate records in multiple databases?
I am working on Election Department databases in India. I have been asked to find the duplicate records of one database with respect to other databases of a state, based on the elector's name, guardian's name and age. A state is divided in assembly…

user837414
- 39
- 5