Questions tagged [drop-duplicates]

Questions related to removing (or dropping) unwanted duplicate values.

A duplicate is any re-occurrence of an item in a collection. This can be as simple as two identical strings in a list of strings, or as involved as multiple complex objects that compare as equal to each other.

Use this tag for questions about detecting and removing such unwanted duplicates.
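
For example, a minimal Python/pandas sketch of both cases (the values and column names here are made up purely for illustration):

    import pandas as pd

    # Deduplicating a simple list of strings: dict.fromkeys keeps only the
    # first occurrence of each value while preserving the original order.
    names = ["alice", "bob", "alice", "carol"]
    unique_names = list(dict.fromkeys(names))  # ['alice', 'bob', 'carol']

    # Deduplicating rows of a DataFrame: drop_duplicates keeps the first
    # occurrence of each identical row by default.
    df = pd.DataFrame({"id": [1, 1, 2], "value": ["a", "a", "b"]})
    deduped = df.drop_duplicates()                     # compare all columns
    deduped_by_id = df.drop_duplicates(subset=["id"])  # compare only "id"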

144 questions
0 votes, 1 answer

Compare two pandas DataFrames from CSV

I have 2 CSV files and I need to compare them using pandas. The values in these two files are the same, so I expect the resulting df to be empty, but it shows me that they are different. Do you think I missed something when I read the CSV files? Or another…
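
If the two frames hold the same values but compare as different, row order, the index, or the dtypes inferred by read_csv are common culprits. A hedged sketch (the file names are hypothetical) that normalises both frames before comparing:

    import pandas as pd

    # Hypothetical file names; the question's actual files are not shown.
    df1 = pd.read_csv("file_a.csv")
    df2 = pd.read_csv("file_b.csv")

    # Sort rows, reset the index, and unify dtypes so only the values matter.
    def normalise(df):
        return (df.sort_values(list(df.columns))
                  .reset_index(drop=True)
                  .astype(str))

    if normalise(df1).equals(normalise(df2)):
        print("The two CSV files contain the same data.")
    else:
        # .compare() (pandas >= 1.1) shows the differing cells when shapes match.
        print(normalise(df1).compare(normalise(df2)))
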
0 votes, 1 answer

Pandas drop_duplicates() not working after adding a row to a DataFrame read from a CSV file

My code is like below: indexing_file_path = 'indexing.csv' if not os.path.exists(indexing_file_path): df = pd.DataFrame([['1111', '20200101', '20200101'], ['1112', '20200101', '20200101'], ['1113',…
fish
0 votes, 1 answer

How to df.drop_duplicates() but store the values of one column as a list

At the moment I am working on some data and have a problem with some duplicates. Here is my problem in detail: I have the DF: Col1 Col2 Col3 'aa1' 'bb1' 'cc1' 'aa2' 'bb2' 'cc2' 'aa1' 'bb3' 'cc3' I can simply use…
Pet
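
One way to answer the question as stated, sketched against the sample frame from the excerpt (which column to collect into a list is an assumption):

    import pandas as pd

    df = pd.DataFrame({"Col1": ["aa1", "aa2", "aa1"],
                       "Col2": ["bb1", "bb2", "bb3"],
                       "Col3": ["cc1", "cc2", "cc3"]})

    # Instead of dropping the rows that repeat Col1, group by Col1 and collect
    # Col3 into a list while keeping the first value of Col2.
    result = (df.groupby("Col1", as_index=False)
                .agg({"Col2": "first", "Col3": list}))
    print(result)
    #   Col1 Col2        Col3
    # 0  aa1  bb1  [cc1, cc3]
    # 1  aa2  bb2       [cc2]
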
0 votes, 2 answers

Eliminate duplicates in MongoDB with a specific sort

I have a database composed of entries which correspond to work contracts. In the MongoDB database I have aggregated by specific worker, then the database - in a simplified version - looks something like this: { "_id" :…
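
A common pattern for this kind of deduplication is an aggregation that sorts first and then keeps the first document per key. A sketch in Python with pymongo; the field names (worker_id, start_date) and the collection are hypothetical stand-ins for the question's real schema:

    from pymongo import MongoClient

    coll = MongoClient()["hr"]["contracts"]  # hypothetical database/collection

    pipeline = [
        # Sort so that "$first" picks the contract you want to keep per worker.
        {"$sort": {"worker_id": 1, "start_date": -1}},
        # Collapse the duplicates by grouping on the duplicate key.
        {"$group": {"_id": "$worker_id", "doc": {"$first": "$$ROOT"}}},
        {"$replaceRoot": {"newRoot": "$doc"}},
    ]
    deduplicated = list(coll.aggregate(pipeline))
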
0 votes, 1 answer

How to drop duplicates after merging two dataframes?

I have two dataframes: A = ID compponent weight 12 Cap 0.4 12 Pump 183 12 label 0.05 14 cap 0.6 B = ID compponent_B weight_B 12 Cap_B 0.7 12 Pump_B 189 12 label 0.05 When I merge these two…
chero
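
A sketch of the usual approach, rebuilt from the sample data in the excerpt; which columns identify "the same" merged row is an assumption about the intent:

    import pandas as pd

    A = pd.DataFrame({"ID": [12, 12, 12, 14],
                      "compponent": ["Cap", "Pump", "label", "cap"],
                      "weight": [0.4, 183, 0.05, 0.6]})
    B = pd.DataFrame({"ID": [12, 12, 12],
                      "compponent_B": ["Cap_B", "Pump_B", "label"],
                      "weight_B": [0.7, 189, 0.05]})

    # Merging on ID alone pairs every A row with every B row that shares the
    # ID, so the result repeats rows; drop_duplicates removes only rows that
    # are identical across all columns.
    merged = A.merge(B, on="ID", how="left").drop_duplicates()

    # If the goal is one row per original A row, deduplicate on the columns
    # that identify an A row instead:
    one_per_a_row = merged.drop_duplicates(subset=["ID", "compponent"])
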
0 votes, 0 answers

Pandas drop_duplicates only possible after to_csv and read_csv

I have two DataFrames which I combine, and they definitely have duplicates as shown later: total_scrobbles = total_scrobbles.append(new_scrobbles) After that the drop_duplicates function doesn't do anything. Not a single row is…
thepic
0 votes, 1 answer

Python: Remove Duplicates From List of Dicts Based on DateTime Key

I want to reduce this list of dictionaries to take the most current record of the duplicates, where duplicates are determined by same project_name and same feature_group_name. How do I go about doing that? The way I'm doing it right now is as…
Riley Hun
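
A minimal sketch of one way to keep only the most recent record per (project_name, feature_group_name); the name of the datetime key ("created") is an assumption:

    from datetime import datetime

    records = [
        {"project_name": "p1", "feature_group_name": "fg1",
         "created": datetime(2021, 1, 1)},
        {"project_name": "p1", "feature_group_name": "fg1",
         "created": datetime(2021, 3, 1)},
        {"project_name": "p2", "feature_group_name": "fg1",
         "created": datetime(2021, 2, 1)},
    ]

    # Keep the record with the latest "created" value for each key pair.
    latest = {}
    for rec in records:
        key = (rec["project_name"], rec["feature_group_name"])
        if key not in latest or rec["created"] > latest[key]["created"]:
            latest[key] = rec

    deduplicated = list(latest.values())
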
0 votes, 1 answer

Python pandas drop_duplicates inserts unnecessary " characters which lead to a CSV loading error

In my project I load data from Twitter every other day and append it to a CSV file. This procedure leads to exact duplicates of tweets in my CSV file. That's why I want to remove these exact duplicates. However, when I run the following…
0 votes, 1 answer

python: drop_duplicates(subset='col_name', inplace=True), why can some of the rows not be dropped?

I'm going to drop duplicates by one of the columns, but some of the rows cannot be dropped. The weird thing is: if I read the 2 files directly instead of via my func1 and func2, then apply the drop function, everything is fine! Update 1: highly likely is the…
Sean.H
0 votes, 3 answers

Groupby to create a list

I am using JupyterLab to print some data in a spreadsheet in a specific way. I have two different files: 1) 2) For every original_id == id I want to group by country, list the brands, and sum and list the holding for each brand. The…
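
A sketch of the grouping itself, with a made-up frame; the exact column names in the question's files are not shown, so these are assumptions:

    import pandas as pd

    df = pd.DataFrame({"id": [1, 1, 1, 2],
                       "country": ["DE", "DE", "FR", "DE"],
                       "brand": ["A", "B", "A", "C"],
                       "holding": [10.0, 5.0, 2.5, 7.0]})

    # One row per (id, country): brands collected into a list, holdings both
    # listed and summed.
    grouped = (df.groupby(["id", "country"])
                 .agg(brands=("brand", list),
                      holdings=("holding", list),
                      holding_total=("holding", "sum"))
                 .reset_index())
    print(grouped)
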
0 votes, 3 answers

Remove repeated rows with inverted values

I have the following dataframe: print(df) col_1 col_2 A B B A A C I would like to remove the duplicated rows with inverted values, obtaining: print(df_final) col_1 col_2 A …
Alessandro Ceccarelli
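
One common approach, sketched against the sample frame from the excerpt: sort the values within each row so that (A, B) and (B, A) become the same pair, then drop the later occurrences:

    import numpy as np
    import pandas as pd

    df = pd.DataFrame({"col_1": ["A", "B", "A"],
                       "col_2": ["B", "A", "C"]})

    # Row-wise sort makes inverted pairs identical; duplicated() then marks
    # the repeats.
    sorted_pairs = pd.DataFrame(np.sort(df[["col_1", "col_2"]].values, axis=1),
                                index=df.index)
    df_final = df[~sorted_pairs.duplicated()]
    print(df_final)
    #   col_1 col_2
    # 0     A     B
    # 2     A     C
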
0 votes, 2 answers

Pandas Drop Specified Duplicates After Concat

I'm trying to write a Python script that concatenates two CSV files and then drops the duplicate rows. Here is an example of the CSVs I'm concatenating: csv_1 type state city date estimate id lux tx dal 2019/08/15 .8273452 …
JMV12
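
A hedged sketch of the concat-then-dedupe step; the file names are hypothetical and the subset columns are an assumption about which fields identify a duplicate row:

    import pandas as pd

    csv_1 = pd.read_csv("csv_1.csv")   # hypothetical paths
    csv_2 = pd.read_csv("csv_2.csv")

    combined = pd.concat([csv_1, csv_2], ignore_index=True)

    # Rows that repeat the identifying columns are dropped, keeping the first
    # occurrence; columns left out of `subset` (e.g. estimate) are ignored
    # when deciding what counts as a duplicate.
    deduped = combined.drop_duplicates(
        subset=["type", "state", "city", "date", "id"], keep="first")
    deduped.to_csv("combined.csv", index=False)
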
0 votes, 1 answer

Pyspark dataframe not dropping all duplicates

I am stuck on what seems to be a simple problem, but I can't see what I'm doing wrong, or why the expected behavior of .dropDuplicates() is not working. A variable I use: print type(pk) print pk ('column1', 'column4') I have a…
nojohnny101
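
Without the original data it is hard to say what goes wrong here, but a minimal sketch of how dropDuplicates treats a subset of columns (the sample rows are invented):

    from pyspark.sql import SparkSession

    spark = SparkSession.builder.getOrCreate()

    df = spark.createDataFrame(
        [(1, "a", "x", 10), (1, "b", "y", 10), (2, "c", "z", 20)],
        ["column1", "column2", "column3", "column4"],
    )

    pk = ("column1", "column4")

    # Only the listed columns are compared: rows that differ elsewhere but
    # share column1/column4 still count as duplicates, and an arbitrary one
    # of them is kept.
    deduped = df.dropDuplicates(list(pk))
    deduped.show()
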
0 votes, 2 answers

How to drop_duplicates using a different condition per group?

I have a DataFrame and I need to drop duplicates per group ('col1') based on the minimum value in another column, 'abs(col1 - col2)', but I need to change this condition for the last group by taking the max value in 'abs(col1 - col2)' that corresponds…
Sidhom
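
A sketch of one way to express "minimum per group, except the last group takes the maximum"; the sample data and the definition of "last group" (highest col1 value) are assumptions:

    import pandas as pd

    df = pd.DataFrame({"col1": [1, 1, 2, 2, 3, 3],
                       "col2": [0, 5, 1, 9, 2, 9]})
    df["diff"] = (df["col1"] - df["col2"]).abs()

    last_group = df["col1"].max()  # assumption: "last" = highest col1 value

    # Keep the row with the minimum diff in every group except the last one,
    # where the row with the maximum diff is kept instead.
    keep_idx = df.groupby("col1")["diff"].idxmin()
    keep_idx.loc[last_group] = df.loc[df["col1"] == last_group, "diff"].idxmax()

    result = df.loc[keep_idx.sort_values()].drop(columns="diff")
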
0 votes, 1 answer

drop_duplicates isn't working on my imported CSV file

Looking for some help on this one. I do not know why, but drop_duplicates is not working; I tried a loop with a lambda, and still nothing I do will remove the multiple duplicates in the output. # Import files for use in the program: import pandas as…
ahhdioguy