df looks like this:

description and keybenefits (14)   brand_cooltouch (1711)  brand_easylogic (1712)
Lorem Ipsum cooltouch Lorem Ipsum
Lorem Ipsum easylogic Lorem Ipsum
Lorem Ipsum Lorem Ipsum

What I want:

  • When column description and keybenefits (14) contains the value 'cooltouch', column brand_cooltouch (1711) should be set to 1 (int).
  • When column description and keybenefits (14) contains the value 'easylogic', column brand_easylogic (1712) should be set to 1 (int).

Output that I want:

description and keybenefits (14)   brand_cooltouch (1711)  brand_easylogic (1712)
Lorem Ipsum cooltouch Lorem Ipsum                       1
Lorem Ipsum Lorem Ipsum easylogic                                               1
Lorem Ipsum Lorem Ipsum

Any help is very much appreciated.

marc_s
Isabella

3 Answers


One can use pandas.Series.str.contains.

For the string cooltouch, do the following:

df['brand_cooltouch (1711)'] = df['description and keybenefits (14)'].str.contains('cooltouch', case=False).astype(int)

[Out]:

    description and keybenefits (14)  brand_cooltouch (1711)  brand_easylogic (1712)
0  Lorem Ipsum cooltouch Lorem Ipsum                       1                     None
1  Lorem Ipsum easylogic Lorem Ipsum                       0                     None
2            Lorem Ipsum Lorem Ipsum                       0                     None

For the string easylogic, do the following:

df['brand_easylogic (1712)'] = df['description and keybenefits (14)'].str.contains('easylogic', case=False).astype(int)

[Out]:

    description and keybenefits (14)  brand_cooltouch (1711)  brand_easylogic (1712)
0  Lorem Ipsum cooltouch Lorem Ipsum                       1                     0
1  Lorem Ipsum easylogic Lorem Ipsum                       0                     1
2            Lorem Ipsum Lorem Ipsum                       0                     0

Notes:

  • case=False makes the match case-insensitive.
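
The two assignments can also be written as one loop over the search terms. The following is only a minimal sketch: the sample frame is rebuilt from the question, the extra row with a missing description and the `brand_columns` mapping are assumptions added for illustration, and `regex=False` and `na=False` are optional arguments that the answer itself does not use.

```python
import pandas as pd

# Sample data rebuilt from the question; the final None row is an added
# assumption to show how a missing description is handled.
df = pd.DataFrame({
    "description and keybenefits (14)": [
        "Lorem Ipsum cooltouch Lorem Ipsum",
        "Lorem Ipsum easylogic Lorem Ipsum",
        "Lorem Ipsum Lorem Ipsum",
        None,
    ],
})

# Hypothetical mapping from search term to target column.
brand_columns = {
    "cooltouch": "brand_cooltouch (1711)",
    "easylogic": "brand_easylogic (1712)",
}

text = df["description and keybenefits (14)"]
for term, column in brand_columns.items():
    # regex=False treats the term as a literal substring;
    # na=False counts a missing description as "no match",
    # so the boolean result can be cast to int safely.
    matches = text.str.contains(term, case=False, regex=False, na=False)
    df[column] = matches.astype(int)

print(df)
```

Without `na=False`, a missing description would produce a missing value in the match result, and the cast to int would likely fail, so it is worth setting if the column can contain gaps.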
Gonçalo Peres
  • @Isabella didn't understand your goal... If the goal is to replace `None` with `0`, one might want to [check this](https://stackoverflow.com/q/23743460/7109869). – Gonçalo Peres Oct 19 '22 at 09:46
  • Alternatively, if that doesn't answer your question, and assuming it doesn't exist yet, I would recommend asking a new question. Most likely you will get the needed help from the community. – Gonçalo Peres Oct 19 '22 at 09:46

You can use np.where. I'd suggest filling all cells where the condition is not met with NaN or 0. Here is a solution using np.nan:

import numpy as np
df["brand_cooltouch (1711)"] = np.where(df["description and keybenefits (14)"].str.contains("cooltouch"), 1, np.nan)
df["brand_easylogic (1712)"] = np.where(df["description and keybenefits (14)"].str.contains("easylogic"), 1, np.nan)
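
The choice of fill value affects the column's dtype, which matters if the 1 has to be an int as the question asks. A minimal sketch, with sample data assumed from the question and the names `mask`, `with_nan` and `with_zero` introduced only for illustration:

```python
import numpy as np
import pandas as pd

# Sample data rebuilt from the question.
df = pd.DataFrame({
    "description and keybenefits (14)": [
        "Lorem Ipsum cooltouch Lorem Ipsum",
        "Lorem Ipsum easylogic Lorem Ipsum",
        "Lorem Ipsum Lorem Ipsum",
    ],
})

mask = df["description and keybenefits (14)"].str.contains("cooltouch")

# Filling with np.nan forces a float result: matches become 1.0.
with_nan = pd.Series(np.where(mask, 1, np.nan), index=df.index)

# Filling with 0 keeps an integer result: matches stay 1.
with_zero = pd.Series(np.where(mask, 1, 0), index=df.index)

print(with_nan.dtype)   # float64
print(with_zero.dtype)  # an integer dtype (width depends on the platform)
```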
TiTo

Use Series.str.contains:

df['brand_cooltouch (1711)'] = df['description and keybenefits (14)'].str.contains("cooltouch").astype(int)

Output

    description and keybenefits (14)  brand_cooltouch (1711)  brand_easylogic (1712)
0  Lorem Ipsum cooltouch Lorem Ipsum                       1                     NaN
1  Lorem Ipsum easylogic Lorem Ipsum                       0                     NaN
2            Lorem Ipsum Lorem Ipsum                       0                     NaN

If you do not want the resulting column to hold 1s and 0s, you could instead set '1' for matching rows and an empty string for the rest:

df.loc[df['description and keybenefits (14)'].str.contains("cooltouch"), ['brand_cooltouch (1711)']] = '1'
df.loc[~df['description and keybenefits (14)'].str.contains("cooltouch"), ['brand_cooltouch (1711)']] = ''

Output

    description and keybenefits (14) brand_cooltouch (1711)  brand_easylogic (1712)
0  Lorem Ipsum cooltouch Lorem Ipsum                      1                     NaN
1  Lorem Ipsum easylogic Lorem Ipsum                                            NaN
2            Lorem Ipsum Lorem Ipsum                                            NaN
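
If the goal is an integer 1 for matches and an empty cell otherwise, as in the question's desired output, one option is pandas' nullable integer dtype instead of strings. This is only a sketch: the sample data is rebuilt from the question, and the `"Int64"` dtype is a swapped-in technique that this answer does not use.

```python
import pandas as pd

# Sample data rebuilt from the question.
df = pd.DataFrame({
    "description and keybenefits (14)": [
        "Lorem Ipsum cooltouch Lorem Ipsum",
        "Lorem Ipsum easylogic Lorem Ipsum",
        "Lorem Ipsum Lorem Ipsum",
    ],
})

# Compute the match once and reuse it.
mask = df["description and keybenefits (14)"].str.contains("cooltouch")

# Start from a nullable-integer column of 1s, then keep the 1 only where
# the description matched; all other rows become <NA>.
ones = pd.Series(1, index=df.index, dtype="Int64")
df["brand_cooltouch (1711)"] = ones.where(mask)

print(df)
```

The column stays an integer type, and the missing entries are written as empty fields by default when the frame is exported with `to_csv`.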
Mortz