
Is there a way to identify similar noun phrases? Some suggest using pattern-based approaches, for example "X as Y" expressions:

Usain Bolt as Sprint King

Liverpool as Reds

Shimak

1 Answer


There are several techniques for finding alternative names for a given entity. One is to scan a large collection of documents (e.g., Wikipedia or newspaper articles) for patterns such as:

  • X also known as Y
  • X also titled as Y
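As a rough illustration of the pattern scan, here is a minimal Python sketch. It assumes that entity names are runs of capitalized words, which is a simplification, and the `CONNECTORS` list and the `find_aliases` helper are hypothetical names made up for this example, not part of any library.

```python
import re

# Hypothetical connector phrases; extend the list to suit your corpus.
CONNECTORS = ["also known as", "also titled as"]

# Simplifying assumption: a name is one or more capitalized words.
NAME = r"\b[A-Z][\w'-]*(?:\s+[A-Z][\w'-]*)*"

PATTERN = re.compile(
    r"(?P<x>" + NAME + r"),?\s+(?:"
    + "|".join(re.escape(c) for c in CONNECTORS)
    + r")\s+(?P<y>" + NAME + r")"
)


def find_aliases(text):
    """Return (X, Y) pairs for every 'X <connector> Y' match in text."""
    return [(m.group("x"), m.group("y")) for m in PATTERN.finditer(text)]


if __name__ == "__main__":
    print(find_aliases("Usain Bolt, also known as Sprint King, won again."))
```

On real text you would probably replace the capitalized-word heuristic with a noun-phrase chunker or a named-entity tagger, since many names are not capitalized consistently.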

There are other alternatives too. One I remember is using Wikipedia's inter-article link structure, for instance by exploring the redirect links between articles. You can download a file with a list of redirects from here: https://wiki.dbpedia.org/Downloads2015-04 and, by exploring the file, find alternative names/synonyms for entities, e.g.:

  • Kennedy_Centre -> John_F._Kennedy_Center_for_the_Performing_Arts
  • Lord_Alton_of_Liverpool -> David_Alton,_Baron_Alton_of_Liverpool
  • Indiana_jones_2 -> Indiana_Jones_and_the_Temple_of_Doom
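Redirect pairs like these can be collected into an alias table. The sketch below assumes the redirects file is in an N-Triples-like format, one `<source> <predicate> <target> .` statement per line with `/resource/` URIs; check the actual file before relying on this. The `load_aliases` helper and the sample lines are hypothetical and only mirror the examples above.

```python
import re
from collections import defaultdict

# Assumed line shape: <.../resource/SOURCE> <predicate> <.../resource/TARGET> .
TRIPLE = re.compile(
    r"<[^>]*/resource/([^>]+)>\s+<[^>]+>\s+<[^>]*/resource/([^>]+)>\s*\."
)


def load_aliases(lines):
    """Map each redirect target to the set of names that redirect to it."""
    aliases = defaultdict(set)
    for line in lines:
        match = TRIPLE.match(line.strip())
        if match is None:  # skip comments and lines in another shape
            continue
        source, target = match.groups()
        aliases[target.replace("_", " ")].add(source.replace("_", " "))
    return aliases


if __name__ == "__main__":
    # Hypothetical sample lines built from the examples above.
    sample = [
        "# comment line",
        "<http://dbpedia.org/resource/Indiana_jones_2> "
        "<http://dbpedia.org/ontology/wikiPageRedirects> "
        "<http://dbpedia.org/resource/Indiana_Jones_and_the_Temple_of_Doom> .",
    ]
    for target, names in load_aliases(sample).items():
        print(target, "<-", sorted(names))
```

For the full download you would pass an open file object instead of a list, since iterating over it yields one line at a time.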

You can also combine these two techniques. For instance, look for text segments where both Indiana Jones and Indiana_Jones_and_the_Temple_of_Doom occur no more than, let's say, 4 or 5 tokens apart, and inspect the words between them. You might find patterns like "also titled as", which you can then use to find more synonyms/alternative names.
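The combined approach could look roughly like the following sketch. It uses whitespace tokenization and exact token matching, which are simplifying assumptions, and the `normalize`, `find_all` and `connecting_patterns` helpers are hypothetical names for this example. The alias in the demo is "Indiana Jones 2" rather than "Indiana Jones", because a name that is a prefix of the other one would also match inside it.

```python
from collections import Counter


def normalize(text):
    """Lowercase, split on whitespace, and strip surrounding punctuation."""
    tokens = (t.strip(".,;:()\"'").lower() for t in text.split())
    return [t for t in tokens if t]


def find_all(tokens, phrase):
    """Start indices of every occurrence of phrase (a token list) in tokens."""
    n = len(phrase)
    return [i for i in range(len(tokens) - n + 1) if tokens[i:i + n] == phrase]


def connecting_patterns(sentences, pairs, max_gap=5):
    """Count token sequences that link the two names of a known alias pair."""
    counts = Counter()
    for sentence in sentences:
        tokens = normalize(sentence)
        for alias, canonical in pairs:
            a, c = normalize(alias), normalize(canonical)
            # Try both orders: alias before canonical, and the reverse.
            for first, second in ((a, c), (c, a)):
                for i in find_all(tokens, first):
                    start = i + len(first)
                    for j in find_all(tokens, second):
                        if 0 < j - start <= max_gap:
                            counts[" ".join(tokens[start:j])] += 1
    return counts


if __name__ == "__main__":
    sentences = [
        "Indiana Jones 2, also titled as Indiana Jones and the Temple of Doom, is a film.",
    ]
    pairs = [("Indiana Jones 2", "Indiana Jones and the Temple of Doom")]
    print(connecting_patterns(sentences, pairs).most_common())
```

Patterns that show up for many different alias pairs are the ones worth feeding back into the pattern scan; ones seen only once are more likely to be noise.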

David Batista
  • Is it possible to use the pattern approach on a collection of Twitter or Facebook posts, for example to retrieve posts similar to a keyword by searching through the collection? – Shimak Oct 28 '18 at 22:15
  • The patterns you mention above are known as "Hearst Patterns". A paper that automatically finds and evaluates similar patterns is "Learning syntactic patterns for automatic hypernym discovery". Traditionally these patterns were for finding hypernyms, but with a little change they work for synonyms too. https://papers.nips.cc/paper/2659-learning-syntactic-patterns-for-automatic-hypernym-discovery – polm23 Dec 28 '18 at 10:01