Questions tagged [drop-duplicates]

Questions related to removing (or dropping) unwanted duplicate values.

A duplicate is any re-occurrence of an item in a collection. This can be as simple as two identical strings in a list of strings, or as involved as multiple complex objects that are treated as equal when compared to each other.

This tag may pertain to questions about removing unwanted duplicates.
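
Both cases from the description can be sketched in a few lines of Python. This is only an illustration: the `dedupe` helper and the `Point` class are hypothetical names introduced here, and the sketch assumes the items (or their keys) are hashable.

```python
from dataclasses import dataclass


def dedupe(items, key=None):
    """Return items without later re-occurrences, keeping first-seen order."""
    seen = set()
    result = []
    for item in items:
        # Compare items directly, or by a derived key when one is given.
        marker = item if key is None else key(item)
        if marker not in seen:
            seen.add(marker)
            result.append(item)
    return result


@dataclass(frozen=True)  # frozen + default eq makes instances hashable
class Point:
    x: int
    y: int


# Simple case: identical strings in a list of strings.
unique_names = dedupe(["a", "b", "a", "c", "b"])

# Complex objects that compare as equal are treated as duplicates.
unique_points = dedupe([Point(1, 2), Point(1, 2), Point(3, 4)])

# A key function decides what "the same" means, here ignoring case.
unique_mixed = dedupe(["Apple", "apple", "Pear"], key=str.lower)
```

The first occurrence of each item is the one that survives; which occurrence to keep is a design choice that many of the questions below turn on.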

Removing duplicates between two workbooks

I need help removing duplicates between two workbooks: "Master" workbook and "Copy" workbook. I'm looking for values in these columns that are duplicates: Copy Column D = Master Column A; Copy Column O = Master Column C; Copy Column R = Master…

excel vba duplicates worksheet drop-duplicates

asked Jun 30 '23 at 18:18

jessehhhca

0 answers

Python - Remove duplicate from dataframe for specific values stored in list

I am working with a dataset where the dataframe contains multiple duplicates for entries. The entries whose duplicates I need to remove are stored in a list. I can't seem to find a way to remove the duplicates in the dataframe. The methods I have…

dataframe list duplicates drop-duplicates

asked Jun 24 '23 at 11:11

Buddy

2 answers

What is a more efficient way to remove duplicates from a CSV file based on specific fields using a batch script (and gawk, if needed)?

I have two csv documents that contain lists of files from a source and destination in Google Drive generated by GAM. One is called copytoarchive.csv and lists all relevant files in the source. The other is alreadyinarchive.csv and lists all relevant…

csv batch-file awk drop-duplicates

asked Jun 01 '23 at 02:23

Joshua Howard

1 answer

drop nearly duplicates (pandas)

I have a dataframe with three columns: 'id', 'subject', 'delta'. I would like a function that treats rows where id and subject are repeated as duplicates, but delta, which is an integer, can be considered a duplicate if the difference between…

pandas drop-duplicates

asked May 16 '23 at 16:47

rafa.mf_

2 answers

Remove float duplicates from a list of tuples created by zip

I create a list of tuples by zipping three lists together, data pairs: XYZip = list(zip(XaData, Y1aData, Y2aData)) [ (0.001625625, 4.782947316198166, -0.011032947316198166), (-2.5e-06, 4.783447358402665, 0.020216552641597337), …

python-3.x list set tuples drop-duplicates

asked Apr 11 '23 at 02:53

casandra9

1 answer

postgresql INSERT INTO all columns from a table

I am trying to write a method that removes duplicates from tables, without having to know the details of the table for generality (i.e., it should run on any table). I am using the following method from here (last method) through psycopg2: CREATE…

sql postgresql drop-duplicates

asked Apr 06 '23 at 11:02

Aaron Bramson

1 answer

drop_duplicates not dropping the duplicate records of the same dtype object

I have the following dataframe: DF1: col1 | col2 | col3, with rows (1, 2, 3), (4, 5, 6), (40, 50, 60). When I print the dtypes of these columns, all of them are objects. Now, I want to add a new row (input as a dataframe), so I…

python pandas dataframe drop-duplicates

asked Mar 10 '23 at 19:12

Jay Patel

2 answers

Duplicated float values in pandas even after dropping them

I have a column of float values that behaves strangely: even after setting the variable type and dropping duplicates, I still have duplicated values. I attached a screenshot with the code and the strange result. I tried using different types of variable and…

python pandas dataframe drop-duplicates

asked Mar 05 '23 at 14:37

Mateusz Szymczak

0 answers

Drop_duplicates + groupby -->TypeError: sequence item 0: expected str instance, int found

My ex-colleague wrote code that imports an Excel file and makes some changes to it. During the process we started receiving this error. Do you have any idea how I can fix it? Here is the problematic part of the code: ### Concat LI related…

python-3.x aggregate drop-duplicates

asked Feb 13 '23 at 17:14

Berk Kalyoncu

0 answers

How can a duplicate row be dropped with some condition

I have a DF that looks like the following table, with columns (Name, Year): (Alice, 2019), (Bob, 2020), (John, 2021), (Bob, 2022). For each unique 'Name', I would like to check which 'Year' is higher and drop the row with the lower 'Year'. For example can I drop the…

python pandas dataframe drop drop-duplicates

asked Jan 25 '23 at 10:06

Aleksandra Salkina

1 answer
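
One common way to approach the question above, sketched under the assumption that the data is a pandas DataFrame with exactly the `Name` and `Year` columns shown in the excerpt: sort by `Year`, then keep the last row per `Name`. The variable names `df` and `latest` are chosen here for illustration and this is not necessarily the accepted answer.

```python
import pandas as pd

# Sample data copied from the question excerpt.
df = pd.DataFrame(
    {"Name": ["Alice", "Bob", "John", "Bob"], "Year": [2019, 2020, 2021, 2022]}
)

# After sorting by Year, the last row for each Name has the highest Year.
latest = (
    df.sort_values("Year")
    .drop_duplicates(subset="Name", keep="last")
    .sort_index()  # restore the original row order
)
```

With the sample data, the row for Bob in 2020 is dropped and the row for Bob in 2022 is kept.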

How to drop_duplicates in python

I have to compare two csv files, drop the duplicate rows, and generate another file. #here I'm comparing the csv files. The oldest_file and the newest_file different_data_type = newest_file.equals(other = oldest_file) #If they have…

python pandas drop-duplicates

asked Jan 23 '23 at 12:50

Matheus

2 answers

Using `drop_duplicates` on a Pandas dataframe isn't dropping rows

Situation: I have a dataframe similar to the one below (although I've removed many of the rows for this example, as evidenced in the 'index'…

python pandas dataframe drop-duplicates

asked Jan 09 '23 at 15:08

dsx

2 answers

Pandas Drop Duplicates And Store Duplicates

I use pandas.DataFrame.drop_duplicates to find duplicates in a dataframe. This removes the duplicates from the dataframe, and it works great. However, I would like to know which data has been removed. Is there a way to save the data in a…

python pandas dataframe duplicates drop-duplicates

asked Dec 16 '22 at 07:58

corsin sauber

1 answer
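
A minimal sketch of one way to keep track of the removed rows, assuming whole-row duplicates and the default "keep the first occurrence" behaviour: `DataFrame.duplicated` returns the boolean mask that `drop_duplicates` acts on, so the mask can split the frame into kept and removed parts. The sample data and the names `is_dup`, `removed`, and `kept` are made up for illustration.

```python
import pandas as pd

# Hypothetical sample data; rows 2 and 4 repeat rows 0 and 1.
df = pd.DataFrame({"key": ["a", "b", "a", "c", "b"], "value": [1, 2, 1, 3, 2]})

# True for every row that repeats an earlier row.
is_dup = df.duplicated(keep="first")

removed = df[is_dup]   # the rows drop_duplicates() would discard
kept = df[~is_dup]     # the same rows df.drop_duplicates() returns
```

The `removed` frame can then be written out or inspected separately, for example with `removed.to_csv(...)`.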

How to group by first column, select latest value of second column, and all respective values of third column

I have a df: {'ID': {0: 'A', 1: 'A', 2: 'A', 3: 'B', 4: 'B', 5: 'B', 6: 'C', 7: 'C', 8: 'C', 9: 'C'}, 'Date': {0: Timestamp('2020-03-02 00:00:00'), 1: Timestamp('2021-04-03 00:00:00'), 2: Timestamp('2021-04-03 00:00:00'), …

python pandas sorting group-by drop-duplicates

asked Dec 14 '22 at 23:35

Shichimi

1 answer

How do I drop only contiguous rows (all but one) in a pandas DataFrame according to column values?

I have a DataFrame that looks like this: Column1 Column2 0 cat A 1 cat B 2 cat C 3 dog D 4 dog E 5 cat F I want to drop all but one of the contiguous rows where Column 1 has…

pandas drop-duplicates

asked Dec 07 '22 at 16:35

D. Schreiber
