I have a df that has a column called EMAIL, which contains various email addresses. I want to remove all the special characters, specifically ., -, and _ that come before @ and append a new column NEW_EMAIL. For example, if df['EMAIL'] = 'ab_cd_123@email.com', I want df['NEW_EMAIL'] = 'abcd123@email.com'.
I was able to remove periods successfully with the code below, but I cannot seem to remove underscores or dashes in the same line of code. Right now, I am repeating the same line three times, once for each special character, which is quite ugly. Can someone lend me a hand, please? Thank you in advance for your help.
df['NEW_EMAIL'] = df.EMAIL.str.replace(r'\.(?!.{1,4}$)','', regex = True)
df['NEW_EMAIL'] = df.NEW_EMAIL.str.replace(r'\.(?!.{1,4}$)','', regex = True)
df['NEW_EMAIL'] = df.NEW_EMAIL.str.replace(r'\.(?!.{1,4}$)','', regex = True)
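
One possible way to handle all three characters in a single call is a character class combined with a lookahead that requires an @ later in the string. This is only a sketch: the sample addresses and the `pattern` variable are made up for illustration, and it assumes each address contains exactly one @.

```python
import pandas as pd

# Hypothetical sample data; only the EMAIL column name comes from the question.
df = pd.DataFrame(
    {"EMAIL": ["ab_cd_123@email.com", "first.last-name@my-site.co.uk"]}
)

# [._-]        matches a period, underscore, or dash
# (?=[^@]*@)   only if an @ still lies ahead, i.e. we are before the @
pattern = r"[._-](?=[^@]*@)"

df["NEW_EMAIL"] = df["EMAIL"].str.replace(pattern, "", regex=True)
print(df)
```

Because the lookahead fails for anything after the @, periods and dashes in the domain part are left alone, so no length-based check such as `.{1,4}$` should be needed.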