I have a big DataFrame (~1 million rows) and I need to delete some rows based on the unique identifier Trade_Id. The rows to delete (45,000 in my test database) are stored in another DataFrame called tib. My current approach is this:
    lentib = len(tib)
    for i in range(lentib):  # VERY SLOW: rebuilds dat once per id in tib
        dat = dat[dat.Trade_Id != tib.Trade_Id[i]]
The problem is that this is very slow, and comparing the two columns directly with dat[dat.Trade_Id != tib.Trade_Id] does not work.
Does anyone have a better idea for doing this more efficiently? I have other databases like this one to work with, and I would rather not spend two days computing this.
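For reference, one direction that might avoid the loop is a single membership test with pandas' `Series.isin`, followed by a negated boolean mask. The sketch below is only an illustration under stated assumptions: the tiny `dat` and `tib` frames, the `Price` column, and the helper name `drop_trades` are hypothetical stand-ins, not the real data or an established API.

```python
import pandas as pd


def drop_trades(dat: pd.DataFrame, tib: pd.DataFrame) -> pd.DataFrame:
    """Return dat without the rows whose Trade_Id appears in tib.

    Hypothetical helper: assumes both frames have a Trade_Id column.
    """
    # True for every row of dat whose Trade_Id is present in tib
    to_drop = dat["Trade_Id"].isin(tib["Trade_Id"])
    # Keep only the rows that are NOT marked for deletion
    return dat[~to_drop]


# Made-up sample data standing in for the real frames
dat = pd.DataFrame(
    {"Trade_Id": [101, 102, 103, 104, 105], "Price": [1.0, 2.0, 3.0, 4.0, 5.0]}
)
tib = pd.DataFrame({"Trade_Id": [102, 104]})

result = drop_trades(dat, tib)
print(result["Trade_Id"].tolist())  # [101, 103, 105]
```

This builds the mask in one pass over `dat` instead of filtering the whole frame once per id in `tib`, so it should scale much better than the loop, though I have not timed it on the full data.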