I have a DataFrame which is quite big. A snippet of it is shown below.
SrNo | Merchant | Revenue | Currency
1 | UBER SR | 123 | INR
2 | UBER (SR)| 123 | INR
3 | SR UBER | 123 | INR
4 | ZOMATO SR| 123 | INR
5 | ZOMATOSR | 123 | INR
6 |12FLIPAKRT| 123 | INR
7 | FLIPKART | 123 | INR
My output should look like:
SrNo | Merchant | Revenue | Currency |Merchant_Flag
1 | UBER SR | 123 | INR | UBER
2 | UBER (SR)| 123 | INR | UBER
3 | SR UBER | 123 | INR | UBER
4 | ZOMATO SR| 123 | INR | ZOMATO
5 | ZOMATOSR | 123 | INR | ZOMATO
6 |12FLIPAKRT| 123 | INR | FLIPKART
7 | FLIPKART | 123 | INR | FLIPKART
Explanation: I want to add an additional column whose values are derived from the Merchant column, i.e. if the Merchant value contains UBER, Merchant_Flag should be UBER, and likewise for ZOMATO and FLIPKART.
My dataset is huge. I tried using re.search and then .replace with if/else for my conditions, but that gives me performance issues. Another approach I tried was using .loc with str.contains:
df.loc[df['columnname'].str.contains('')]
I am not sure how to proceed from there. Can someone help with this?
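To make the question reproducible, here is a minimal sketch of the kind of vectorized approach I am aiming for. It assumes the merchant names are known up front (the `merchants` list is my assumption, not something in the data) and uses `Series.str.extract` with a single alternation pattern instead of a per-row `re.search`. It may well not be the fastest option.

```python
import re

import pandas as pd

# Sample data copied from the snippet above.
df = pd.DataFrame({
    "SrNo": [1, 2, 3, 4, 5, 6, 7],
    "Merchant": ["UBER SR", "UBER (SR)", "SR UBER", "ZOMATO SR",
                 "ZOMATOSR", "12FLIPAKRT", "FLIPKART"],
    "Revenue": [123] * 7,
    "Currency": ["INR"] * 7,
})

# Assumption: the list of merchant names to flag is known in advance.
merchants = ["UBER", "ZOMATO", "FLIPKART"]

# One capture group containing all names, escaped in case a name
# contains regex metacharacters.
pattern = "(" + "|".join(re.escape(m) for m in merchants) + ")"

# Vectorized extraction: the first matching name per row, NaN if none match.
df["Merchant_Flag"] = df["Merchant"].str.extract(pattern, expand=False)

print(df)
```

With plain substring matching like this, the misspelled row `12FLIPAKRT` gets NaN rather than FLIPKART, so rows like that would still need some form of fuzzy matching or a manual alias table.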