Validate Initials Pandas Python

Question

Can someone help me validate a column? It is an 'Initials' column.

Contact	Initials
1	P.J.
2	Peter
3	P.

An initial consists of one letter followed by a dot, as in rows 1 and 3. Row 2 is invalid.

I hope someone can help.

jezrael · Accepted Answer · 2021-07-13T05:35:33.527

1

Use Series.str.contains to test that the whole value is made up of uppercase letters, each followed by a dot:

print (df)
   Contact Initials
0        1     P.J.
1        2    Peter
2        3   P.Daa.
3        4       P.
4        5      H..
5        6      J.K

# Pattern from https://stackoverflow.com/a/17779796/2901002,
# anchored with ^ (start of string) and $ (end of string)
df['test'] = df['Initials'].str.contains(r'^(?:[A-Z]\.)+$')
print(df)
   Contact Initials   test
0        1     P.J.   True
1        2    Peter  False
2        3   P.Daa.  False
3        4       P.   True
4        5      H..  False
5        6      J.K  False
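
As a variation on the pattern above, here is a minimal sketch that uses Series.str.fullmatch (available in pandas 1.1 and later) instead of str.contains with explicit ^ and $ anchors. The extra row holding a missing value and the choice of na=False (treat missing values as invalid) are assumptions added for illustration; they are not part of the original answer.

```python
import pandas as pd

# Sample data from the answer, plus one hypothetical missing value
df = pd.DataFrame({
    "Contact": [1, 2, 3, 4, 5, 6, 7],
    "Initials": ["P.J.", "Peter", "P.Daa.", "P.", "H..", "J.K", None],
})

# One or more groups of "uppercase letter followed by a literal dot".
# fullmatch requires the whole string to match, so no ^ or $ is needed.
pattern = r"(?:[A-Z]\.)+"

# na=False is an assumption: missing values are reported as invalid
df["test"] = df["Initials"].str.fullmatch(pattern, na=False)

result = df["test"].tolist()
print(df)
```

With str.contains, dropping the anchors would let values such as H.. and J.K pass, because the pattern only has to match somewhere inside the string.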

edited Jul 13 '21 at 05:35

answered Jul 12 '21 at 13:37


For example, the Initials H.. and J.K give True. But they should be False, because H.. must have only one dot and J.K has no dot at the end. Do you know how I can validate this, please? – Monkey D Jul 12 '21 at 18:24
Thanks!!! I don't know if I can keep asking questions, but the challenge I have now is to clean the data so that everything looks like row one. Row 2 doesn't have to become P., because that's only possible with machine learning, I guess. – Monkey D Jul 13 '21 at 07:39
@MonkeyD - I think that is not so easy; it is best to create a new question. – jezrael Jul 13 '21 at 07:40

score 0 · Answer 2 · answered Jul 12 '21 at 13:36

You could consider writing a custom function to check for your two conditions:

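def validate(string):
    # Reject non-strings (e.g. NaN) and values too short to index safely
    if not isinstance(string, str) or len(string) < 2:
        return False
    # First character must be a letter, second must be a dot
    return string[0].isalpha() and string[1] == "."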

Then apply it to the column like so:

>>> df["Initials"].apply(validate)
0     True
1    False
2     True
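
Note that the function above only looks at the first two characters, so a value such as P.Daa. would also pass. A stricter variant could check the whole string with a regular expression. The sketch below is one possible way to do that; the name validate_strict and the rule that only uppercase letters count are assumptions, taken from the accepted answer's pattern rather than from this answer.

```python
import re

# One or more groups of "uppercase letter followed by a literal dot"
INITIALS_RE = re.compile(r"(?:[A-Z]\.)+")

def validate_strict(value):
    """Return True only if the whole value looks like 'P.' or 'P.J.'."""
    # Non-strings (e.g. NaN or None) are treated as invalid
    if not isinstance(value, str):
        return False
    # fullmatch requires the entire string to match the pattern
    return INITIALS_RE.fullmatch(value) is not None

samples = ["P.J.", "Peter", "P.Daa.", "P.", "H..", "J.K", "", None]
results = [validate_strict(s) for s in samples]
print(results)
```

It can be applied to the column in the same way, with df["Initials"].apply(validate_strict).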
