Recognition of first and last name as one entity

Question

I am interested in natural language processing. Is there a well-known algorithm that can recognize a first and last name in a text as a single entity?

For example, if we have this:

Last week John Wayne went to Europe.

I want a tokenizer that gives: "Last", "week", "John Wayne", "went", "to", "Europe".

Any help is appreciated.

mbatchkarov · Answer 1 · 2019-09-26T08:07:02.970

4

This is an essential part of named entity recognition (NER), and most NER algorithms do it out of the box (most of the time). For example, I ran your sentence through the Stanford NER system's web interface and got:

Last week <PERSON>John Wayne</PERSON> went to <LOCATION>Europe</LOCATION>.

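Depending on which algorithm you use, the output may be formatted differently. The most common format is IOB (inside, outside, beginning), where each token gets a tag saying whether it begins an entity, continues one, or lies outside any entity.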
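To show how IOB output maps back to the tokenization asked for in the question, here is a minimal Python sketch. The tags below are written by hand for illustration (they are not the output of any particular tagger), and the function name `merge_iob` and the `B-PERSON`/`I-PERSON`/`B-LOCATION` label spellings are assumptions; real systems may use different label names.

```python
def merge_iob(tokens, tags):
    """Join consecutive tokens that belong to the same IOB-tagged entity."""
    merged = []
    current = None  # entity type of the span currently open, if any
    for token, tag in zip(tokens, tags):
        prefix, _, label = tag.partition("-")
        if prefix == "I" and label == current:
            # Continuation of the open entity: glue onto the previous token.
            merged[-1] += " " + token
        else:
            # "O", "B-...", or a stray "I-..." all start a new output token.
            merged.append(token)
        current = label if prefix in ("B", "I") else None
    return merged


# Hand-written IOB tags for the example sentence (illustrative only).
tokens = ["Last", "week", "John", "Wayne", "went", "to", "Europe"]
tags = ["O", "O", "B-PERSON", "I-PERSON", "O", "O", "B-LOCATION"]

result = merge_iob(tokens, tags)
print(result)
```

The merging step is language-independent: it only looks at the tags, so the same function works for any language once a tagger for that language produces IOB labels.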

edited Sep 26 '19 at 08:07

answered Jun 11 '14 at 08:58

mbatchkarov


Thanks for the answer. Can you please suggest a few NER algorithms that can do this? I am especially interested in non-English languages, so I would like to know about algorithms rather than tools. – TJ1 Jun 11 '14 at 12:55
Do you want to know how NER is done in general or do you want a tool that can do NER for you? – mbatchkarov Jun 11 '14 at 13:12
Stanford's NER uses CRF. – Blacksad Jun 11 '14 at 13:24
I am interested in knowing how NER is done, especially for non-English languages. – TJ1 Jun 19 '14 at 14:01

score 2 · Answer 2 · answered Jun 11 '14 at 07:59

2

If the people mentioned in your text are famous, you can do this:

Run the Illinois Wikifier on your text. For example, here it is run on your sentence: http://cogcomp.cs.illinois.edu/demo/wikify/?id=25
Combine all the words that the Wikifier links to the same web page. For your sentence the output becomes: "Last week John_Wayne went to Europe." You can also record where each combination was made.

Now you can do anything with your text, like giving it to a tokenizer!
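The combining step can be sketched in Python as follows. This is only an illustration: the input format (a list of `(word, link)` pairs, with `None` for unlinked words) and the function name `merge_linked` are assumptions made for the example, not the Wikifier's actual output format, which you would need to parse into this shape first.

```python
from itertools import groupby


def merge_linked(pairs):
    """Join adjacent words that are linked to the same page with underscores."""
    merged = []
    # groupby clusters consecutive pairs that share the same link target.
    for link, group in groupby(pairs, key=lambda pair: pair[1]):
        words = [word for word, _ in group]
        if link is None:
            merged.extend(words)  # unlinked words stay separate
        else:
            merged.append("_".join(words))  # one token per linked mention
    return merged


# Hypothetical (word, linked page) pairs for the example sentence.
pairs = [
    ("Last", None), ("week", None),
    ("John", "John_Wayne"), ("Wayne", "John_Wayne"),
    ("went", None), ("to", None),
    ("Europe", "Europe"),
]

combined = merge_linked(pairs)
print(" ".join(combined))
```

Note that this simple version would also merge two separate mentions of the same page if they happen to be adjacent, so a real pipeline may want to use the mention boundaries reported by the linker instead.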

answered Jun 11 '14 at 07:59

Daniel


Thanks for the answer. This is a good tool, but I am looking for an algorithm. Doing this in English is relatively easy, since first and last names both start with capital letters. I am more interested in algorithms that can be used for other languages. – TJ1 Jun 11 '14 at 13:04
