Is there some way to estimate how likely a word is to be a person's name?
So for the word "understanding" I would get a probability of 0.01, whereas "Johnson" would return 0.99, a word like "Smith" would return 0.75, and a word like "Apple" 0.15.
Is there any way to do this?
The goal is that if someone searches for, say, "Charles Darwin galapagos", the search engine guesses that it should search the author field for "Charles" and "Darwin", and the title and abstract fields for "galapagos".
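To make the desired behavior concrete, here is a minimal sketch of one possible approach, assuming you can count how often each word appears in the author field versus in titles and abstracts of your own index. The counts below are made up for illustration, and the names AUTHOR_COUNTS, TEXT_COUNTS, name_probability and route_query are hypothetical, not part of any library.

```python
from collections import Counter

# Made-up toy counts (assumption): how often each word occurs in the
# author field vs. in titles/abstracts. Real values would come from your index.
AUTHOR_COUNTS = Counter({"charles": 40, "darwin": 25, "johnson": 99,
                         "smith": 75, "apple": 3})
TEXT_COUNTS = Counter({"galapagos": 30, "understanding": 99, "smith": 25,
                       "apple": 17, "darwin": 2, "charles": 1, "johnson": 1})


def name_probability(word, alpha=1.0):
    """Smoothed estimate of P(word is used as a name).

    alpha is an add-alpha smoothing constant, so a word never seen
    in either field gets a neutral score of 0.5.
    """
    w = word.lower()
    as_name = AUTHOR_COUNTS[w]   # Counter returns 0 for missing keys
    as_text = TEXT_COUNTS[w]
    return (as_name + alpha) / (as_name + as_text + 2 * alpha)


def route_query(query, threshold=0.5):
    """Assign each query token to the author or title/abstract field."""
    fields = {"author": [], "title_abstract": []}
    for token in query.split():
        # Strictly greater: unseen words (score 0.5) fall back to title/abstract.
        key = "author" if name_probability(token) > threshold else "title_abstract"
        fields[key].append(token)
    return fields


print(route_query("Charles Darwin galapagos"))
```

With these toy counts, "Charles" and "Darwin" are routed to the author field and "galapagos" to the title and abstract fields. The threshold and the smoothing constant are arbitrary choices here and would need tuning on real data.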