Questions tagged [drop-duplicates]

Questions related to removing (or dropping) unwanted duplicate values.

A duplicate is any re-occurrence of an item in a collection. This can be as simple as two identical strings in a list of strings, or multiple complex objects which are treated as the same object when compared to each other.

This tag may pertain to questions about removing unwanted duplicates.
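As a rough illustration of the definition above, here is a minimal Python sketch that drops duplicates from a list while keeping the first occurrence of each item. The helper name `drop_duplicates`, its `key` parameter, and the sample data are all made up for this example; the `key` function is one possible way to express "objects which are treated as the same when compared".

```python
from typing import Callable, Hashable, Iterable, List, TypeVar

T = TypeVar("T")


def drop_duplicates(items: Iterable[T], key: Callable[[T], Hashable] = lambda x: x) -> List[T]:
    """Return the items in their original order, keeping the first item seen for each key."""
    seen = set()
    result = []
    for item in items:
        k = key(item)
        if k not in seen:  # first time this key appears
            seen.add(k)
            result.append(item)
    return result


# Simple case: identical strings in a list of strings.
names = ["a", "b", "a", "c", "b"]
unique_names = drop_duplicates(names)

# Complex objects: two records count as duplicates when their "id" matches (an assumption).
users = [
    {"id": 1, "name": "Ann"},
    {"id": 2, "name": "Bob"},
    {"id": 1, "name": "ANN"},
]
unique_users = drop_duplicates(users, key=lambda u: u["id"])
```

Keeping the first occurrence is a design choice here; keeping the last one instead would only need the input to be reversed before and after the call.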

Pandas drop_duplicates() without empty rows

I have 2 equal columns in a pandas data frame. Each of the columns has the same duplicates. A B 1 1 1 1 2 2 3 3 3 3 4 4 4 4 I want to delete the duplicates only from column B so that the goal is like the following: A B 1 1 1 2 2 3 3 4 3 4 4 I…

asked Mar 03 '23 at 08:31

David95


1 answer

Why is unique() not picking up some identical rows?

I am working with a large dataset of event logs that looks something like this: time user_id place key version 2023-02-13 06:28:54 30375 School 422i-dmank-ev2eia 2.023 2023-02-13 06:24:42 47127 School wjes-wtpi-byt2rl0 2.023 2023-02-13…

r duplicates drop-duplicates

asked Feb 27 '23 at 07:28

gribbling


1 answer

Python Drop duplicates to ignore case sensitive

I want to delete duplicates from the below df, preserving the case sensitivity. Input df df = pd.DataFrame({ 'company_name': ['Apple', 'apple','apple', 'BlackBerry', 'blackberry','Blackberry'] }) Expected df company_name 0 …

python pandas dataframe drop-duplicates

asked Feb 01 '23 at 19:24

spartacus8w2039
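The expected output in the excerpt above is cut off, so the following is only a sketch under an assumption: rows count as duplicates when they match case-insensitively, and the first spelling encountered is the one that survives. It uses the `df` from the question.

```python
import pandas as pd

df = pd.DataFrame({
    "company_name": ["Apple", "apple", "apple", "BlackBerry", "blackberry", "Blackberry"]
})

# Compare on a lowercased copy, but keep the original spelling of the first match.
# keep="first" is an assumption; the question's expected output is truncated.
is_repeat = df["company_name"].str.lower().duplicated(keep="first")
deduped = df[~is_repeat]
```

Because the lowercased values are only used to build the mask, the surviving rows keep their original capitalization.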


1 answer

Using Python, how do I remove duplicates in a PANDAS dataframe column while keeping/ignoring all 'nan' values?

I have a dataframe like this: import pandas as pd data1 = { "siteID": [1, 2, 3, 1, 2, 'nan', 'nan', 'nan'], "date": [42, 30, 43, 29, 26, 34, 10, 14], } df = pd.DataFrame(data1) But I want to delete any duplicates in siteID, keeping…

python pandas dataframe sorting drop-duplicates

asked Dec 28 '22 at 16:59

Bojan Milinic
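One possible approach to the question above, sketched with the `data1` from the excerpt: build a boolean mask that keeps every row whose `siteID` is the string `'nan'` (as in the question's data) or a real missing value, plus the first occurrence of every other `siteID`. Keeping the first occurrence is an assumption, since the excerpt is truncated before it says which duplicate should survive.

```python
import pandas as pd

data1 = {
    "siteID": [1, 2, 3, 1, 2, "nan", "nan", "nan"],
    "date": [42, 30, 43, 29, 26, 34, 10, 14],
}
df = pd.DataFrame(data1)

# Rows to protect from deduplication: the literal string 'nan' or a true missing value.
is_nan = df["siteID"].eq("nan") | df["siteID"].isna()

# Rows that repeat an earlier siteID (keep="first" is an assumption).
is_repeat = df["siteID"].duplicated(keep="first")

result = df[is_nan | ~is_repeat]
```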


2 answers

How not to add duplicate elements in the DOM

I know it is a very easy question but I am still struggling. I created a function which adds elements to an array, and after that I use a forEach loop to append them to the DOM. But I am unable to prevent the addition of duplicate elements. const…

javascript arrays drop-duplicates

asked Dec 23 '22 at 17:50

Kunal Tanwar



4 answers

How to remove duplicate rows with a condition in pandas

I want to drop duplicate pairs using col1 and col2 as the subset, but only if the values in col3 are opposite (one negative and one positive). This is similar to the drop_duplicates function, but I want to impose a condition and only want to remove the first…

python pandas drop-duplicates

asked Nov 21 '22 at 02:49

bbaba


1 answer

remove-duplicates produces numbers that haven't been in the original list in NetLogo

I'm creating a list out of the patch variable "geb-id" (a 7-digit integer number) with the following line: set geb-id-list [geb-id] of patches with [geb-id >= 0 AND residents != 0] When I look at the produced list, all looks fine. Then I'm…

list netlogo drop-duplicates

asked Aug 16 '22 at 09:10

misch


1 answer

Drop duplicates when a string is present more than once in a column for a group (pandas)

Is there a way to groupby based on 2 columns (Id, Name) in a dataframe and if the presence of a certain string "x_1" in the column "Name" is more than once, then just keep the first row (first occurrence)? Id Name Value 1 x_1 23 1 x_2 24 1 x_1 …

python pandas group-by drop-duplicates

asked Jul 20 '22 at 12:49

user7675621


1 answer

Pandas drop_duplicates returns NoneType when inplace=True and doesn't drop duplicates when inplace=False

I want to remove duplicates from a certain dataframe. When I have inplace=True it in fact removes the duplicates and returns NoneType. When I set inplace=False the dataframe is not modified, even when I assign it to a new variable. This works…

python pandas drop-duplicates

asked Jun 07 '22 at 10:28

Jaxsss
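The behavior described in the question above can be reproduced with a small sketch. The frame and the names `deduped`, `rows_before`, and `returned` are made up for illustration.

```python
import pandas as pd

df = pd.DataFrame({"a": [1, 1, 2], "b": ["x", "x", "y"]})

# inplace=False (the default) leaves df untouched and returns a new DataFrame,
# so the result has to be captured in a variable.
deduped = df.drop_duplicates()
rows_before = len(df)  # df still has all of its rows at this point

# inplace=True modifies df itself; the return value is not a DataFrame
# (the question reports NoneType), so it should not be assigned back to df.
returned = df.drop_duplicates(inplace=True)
rows_after = len(df)
```

A common way to avoid the confusion is to skip `inplace` altogether and write `df = df.drop_duplicates()`.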


1 answer

Pandas dataframe drop duplicates based on another column value

I have a dataframe with duplicates:

timestamp  id  ch  is_eval  c
12         1   1   False    2
13         1   0   False    1
12         1   1   True     4
13         1   0   False    3

When there are duplicates, it is always when I want to drop_duplicates with…

python pandas dataframe data-munging drop-duplicates

asked Jun 01 '22 at 07:26

Cranjis



1 answer

Want to drop duplicate based on one column but want to keep first two rows

Hi, I am dropping duplicates from a dataframe based on one column, i.e. "ID". So far I am dropping the duplicates and keeping the first occurrence, but I want to keep the first (top) two occurrences instead of only one, so I can compare the values of the first two…

python pandas dataframe pandas-groupby drop-duplicates

asked May 12 '22 at 11:54

sandy
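For the question above, one option that may fit is `groupby(...).head(2)`, which keeps up to the first two rows of each group in their original order. Only the `ID` column name comes from the question; the `value` column and the data are invented for this sketch.

```python
import pandas as pd

# Hypothetical data: ID 1 appears three times, ID 2 once, ID 3 twice.
df = pd.DataFrame({
    "ID": [1, 1, 1, 2, 3, 3],
    "value": [10, 11, 12, 20, 30, 31],
})

# Keep at most the first two occurrences of each ID.
first_two = df.groupby("ID").head(2)
```

Groups with fewer than two rows are kept as they are, so IDs that occur only once are not lost.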


1 answer

pandas drop_duplicates works but when saved using .to_csv it still shows all

I'm simply trying to remove duplicates from a csv and then make a new csv file with only the first column and no duplicates. My terminal shows it's working, but the new csv file still shows everything. import pandas as pd import numpy as…

python pandas drop-duplicates

asked Apr 01 '22 at 18:30

Pheng Vue


1 answer

drop_duplicates in pandas for a large data set

I am new to pandas, so sorry for my naiveté. I have two dataframes. One is out.hdf: 999999 2014 1 2 15 19 45.19 14.095 -91.528 69.7 4.5 0.0 0.0 0.0 603879074 999999 2014 1 2 23 53 57.58 16.128 -97.815 23.2 4.8 0.0 0.0 0.0…

python pandas dataframe drop-duplicates

asked Mar 13 '22 at 14:44

Sonia Bazargan


2 answers

Python Pandas : Drop Duplicates Function - Unusual Behaviour

The error -> TypeError: unhashable type: 'list' disappears after saving the data frame and loading it again ... Both data frames [saved and loaded, generated] have the same dtypes ... Reproducible -> --> import pandas as pd --> l1 = [[1], [1], [1],…

python pandas list dataframe drop-duplicates

asked Jan 15 '22 at 05:56

Agnij
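A possible reading of the question above: `drop_duplicates` needs hashable cell values, and Python lists are not hashable, which would explain the `TypeError`. If the frame was saved to a text format such as CSV (an assumption, since the excerpt does not say), the lists would come back as strings, which are hashable, even though the column dtype is `object` in both cases. The sketch below uses made-up data and sidesteps the error by comparing on tuples instead of lists.

```python
import pandas as pd

# Hypothetical frame with a column of lists.
df = pd.DataFrame({"l1": [[1], [1], [2]], "x": ["a", "a", "b"]})

# Build a hashable copy: tuples can be hashed, lists cannot.
hashable = df.assign(l1=df["l1"].apply(tuple))

# Use the hashable copy only to find repeats, then filter the original frame
# so the surviving rows still contain lists.
deduped = df[~hashable.duplicated()]
```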


2 answers

Drop Duplicate Rows Based on Target Class Conditions

I have a dataset with 3 target classes: ‘Yes’, ‘Maybe’, and ‘No’.

Unique_id  target
111        Yes
111        Maybe
111        No
112        No
112        Maybe
113        No

I want to drop duplicate rows…

python pandas dataframe data-manipulation drop-duplicates

asked Sep 07 '21 at 15:44

Roy
