Identify Indian names in a given string of combined name tokens

Question

I have a set of name tokens, and data in which several names are run together without separators. For example, given the three tokens "abc", "def" and "ghi" and a combined name such as "abcdef" or "abcdefghi", I would like to identify the valid tokens that make up the combined string. Can we build a dictionary of name tokens and use NLP techniques or Python libraries to achieve this? Please suggest how to get started.

Please consider including an [mcve] with actual examples, as suggested by @DYZ . Also provide any current code or approach that you are using for now. — dennlinger, Jan 31 '20 at 10:32

score 0 · Answer 1 · answered Jan 31 '20 at 06:51

If you need to find a substring in a string, all you need is a list of tokens and a loop:

tokens = ['abc', 'def', 'ghi']
name = 'abcdef'
for token in tokens:
    if token in name:
        print(token, 'is part of', name)

If you also need the position of the substring within the string, see `str.find()`.
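The loop above reports which tokens occur somewhere in the name, but it does not check that the tokens cover the whole combined string in order. A minimal sketch of one way to do that is a dictionary-based split with memoized recursion. The function name `segment_name` and its helper `split_from` are hypothetical, chosen only for illustration, and the sketch assumes exact, case-sensitive matches against the token list:

```python
from functools import lru_cache


def segment_name(name, tokens):
    """Return every way to split `name` into known tokens, in order."""
    vocab = set(tokens)  # set gives fast membership checks

    @lru_cache(maxsize=None)
    def split_from(start):
        # Reaching the end of the string means one complete split was found.
        if start == len(name):
            return [[]]
        results = []
        # Try every prefix beginning at `start` that is a known token.
        for end in range(start + 1, len(name) + 1):
            piece = name[start:end]
            if piece in vocab:
                for rest in split_from(end):
                    results.append([piece] + rest)
        return results

    return split_from(0)


tokens = ['abc', 'def', 'ghi']
print(segment_name('abcdef', tokens))     # [['abc', 'def']]
print(segment_name('abcdefghi', tokens))  # [['abc', 'def', 'ghi']]
print(segment_name('abcxyz', tokens))     # []
```

An empty list means the name cannot be fully covered by the known tokens. Real name data would likely need case normalization and some tolerance for spelling variants, which this sketch does not attempt.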
