1

I want to drop rows from a DataFrame when two conditions are met in the same row. I have 5 columns; if two of them (code1 and code2) have equal values AND a third column (count) is greater than 1, the row should be dropped.

Alternatively, I could keep the rows that meet the condition:

count == 1 'OR' (as opposed to AND) df_1.code1 != df_1.code2

In terms of the first idea what I am thinking is:

df_1 = '''drop row if''' [df_1.count == 1 & df_1.code1 == df_1.code2] 

Here is what I have so far for the second idea:

df_1 = df_1[df_1.count == 1 or df_1.code1 != df_1.code2] 
Brian Tompsett - 汤莱恩
StringTheo

2 Answers

2

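You can use .loc with a boolean mask that combines multiple conditions. Wrap each condition in parentheses and combine them with | (element-wise OR) rather than the Python keyword or. Access the count column as df_1['count'], because df_1.count refers to the DataFrame.count method, not the column.

df_new = df_1.loc[(df_1['count'] == 1) | (df_1['code1'] != df_1['code2']), :]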
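
A minimal sketch of the keep-rows filter from the question, using .loc on made-up data (the values are hypothetical; only the column names code1, code2 and count come from the question):

```python
import pandas as pd

# Hypothetical sample data; only the column names come from the question.
df_1 = pd.DataFrame({
    "code1": ["a", "b", "c", "d"],
    "code2": ["a", "b", "x", "d"],
    "count": [1, 3, 2, 5],
})

# Keep a row if count is 1 OR the two codes differ.
# Bracket access is needed for "count": df_1.count is a DataFrame method.
keep = (df_1["count"] == 1) | (df_1["code1"] != df_1["code2"])
df_new = df_1.loc[keep, :]

print(df_new)
```

Each comparison sits in its own parentheses because | binds more tightly than == and != in Python.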
Alexander
2
df.drop(df[(df['code1'] == df['code2']) & (df['count'] > 1)].index, inplace=True)

Breaking it to steps:

df[(df['code1'] == df['code2']) & (df['count'] > 1)] returns the subset of rows from df where the value in code1 equals the value in code2 and the value in count is greater than 1.

.index returns the index labels of those rows.

The last step is calling df.drop(), which expects the index labels to be dropped from the DataFrame. Passing inplace=True means we don't need to re-assign the result, i.e. df = df.drop(...).
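
The steps above can be sketched on a small made-up DataFrame (the values are hypothetical; only the column names come from the question). The sketch also shows an inverted boolean mask as an alternative that I would expect to give the same result when the index is unique:

```python
import pandas as pd

# Hypothetical sample data; only the column names come from the question.
df = pd.DataFrame({
    "code1": [10, 20, 30, 40],
    "code2": [10, 20, 31, 40],
    "count": [1, 2, 4, 3],
})

# Step 1: boolean mask marking the rows to drop.
to_drop = (df["code1"] == df["code2"]) & (df["count"] > 1)

# Step 2: index labels of the marked rows.
labels = df[to_drop].index

# Step 3: drop those labels (returns a new DataFrame, df is unchanged).
dropped = df.drop(labels)

# Alternative: keep everything the mask does not mark.
filtered = df[~to_drop]

print(dropped)
print(dropped.equals(filtered))
```

If index labels repeat, drop removes every row that shares a dropped label, so the mask form is the safer choice in that case.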

DeepSpace