Let us say I have a dataframe:

import pandas as pd

first_df = pd.DataFrame({"company": ['abc', 'def', 'xyz', 'lmn', 'def', 'xyz'],
                         "art_type": ['300x240', '100x600', '400x600', '300x240', '100x600', '400x600'],
                         "metrics": ['imp', 'rev', 'cpm', 'imp', 'rev', 'cpm'],
                         "value": [1234, 23, 0.5, 1234, 23, 0.5]})
# Duplicate the rows (DataFrame.append was removed in pandas 2.0, so use pd.concat)
first_df = pd.concat([first_df, first_df])

I want to remove all rows whose company value is in the list ['lmn', 'xyz'] and store those rows in another dataframe.

company_list = ['lmn', 'xyz']

I tried this:

deleted_data = first_df[first_df['company'] in company_list] 

This obviously did not work, because in checks whether the whole column is a member of the list rather than testing each value. Is a for loop the way to do this, or is there a better way?

For loop code:

# DataFrame.append was removed in pandas 2.0; collect the pieces and concatenate once
matches = []
for x in company_list:
    matches.append(first_df[first_df['company'] == x])
deleted_data = pd.concat(matches)
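
For reference, the failure in the first attempt can be reproduced with a minimal sketch like the one below. It assumes a reasonably recent pandas; the names s and error_message are illustrative only. Python's in operator asks the list whether the whole Series equals one of its elements, and the resulting elementwise comparison cannot be reduced to a single True or False, so pandas raises a ValueError.

```python
import pandas as pd

company_list = ['lmn', 'xyz']
# Hypothetical stand-in for first_df['company']
s = pd.Series(['abc', 'def', 'xyz', 'lmn'], name='company')

error_message = None
try:
    # list.__contains__ compares each list element with the whole Series,
    # then tries to convert the boolean Series to a single bool
    s in company_list
except ValueError as exc:
    error_message = str(exc)

print(error_message)
```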
Alexander
Data Enthusiast

1 Answer

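You can filter with isin(), which tests each value in the column against the list and returns a boolean mask. Negating the mask with ~ selects the rows to keep.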

>>> deleted_data = first_df.loc[first_df['company'].isin(company_list)]
>>> deleted_data
  art_type company metrics   value
2  400x600     xyz     cpm     0.5
3  300x240     lmn     imp  1234.0
5  400x600     xyz     cpm     0.5
2  400x600     xyz     cpm     0.5
3  300x240     lmn     imp  1234.0
5  400x600     xyz     cpm     0.5

>>> retained_data = first_df.loc[~first_df['company'].isin(company_list)]
>>> retained_data
  art_type company metrics  value
0  300x240     abc     imp   1234
1  100x600     def     rev     23
4  100x600     def     rev     23
0  300x240     abc     imp   1234
1  100x600     def     rev     23
4  100x600     def     rev     23
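
Since both selections use the same test, one possible variation is to compute the mask once and reuse it. This is only a sketch under the question's setup: the name mask is introduced here for illustration, and pd.concat stands in for the original append call, which newer pandas versions no longer provide.

```python
import pandas as pd

first_df = pd.DataFrame({
    "company": ['abc', 'def', 'xyz', 'lmn', 'def', 'xyz'],
    "art_type": ['300x240', '100x600', '400x600', '300x240', '100x600', '400x600'],
    "metrics": ['imp', 'rev', 'cpm', 'imp', 'rev', 'cpm'],
    "value": [1234, 23, 0.5, 1234, 23, 0.5],
})
# Duplicate the rows, as in the question (index labels repeat)
first_df = pd.concat([first_df, first_df])
company_list = ['lmn', 'xyz']

# Boolean Series: True where the company is in the list
mask = first_df['company'].isin(company_list)

deleted_data = first_df.loc[mask]    # rows to move out
retained_data = first_df.loc[~mask]  # everything else

# The two pieces together account for every original row
print(len(deleted_data) + len(retained_data) == len(first_df))
```

If the repeated index labels are unwanted, calling reset_index(drop=True) on either result gives it a fresh 0..n-1 index.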
Alexander