I am looking for a means of identifying UK University names mentioned in Tweet text.

I have a list of full University names, but the issue is dealing with shortened versions such as "aber uni" (Aberystwyth Uni), "staffs uni" (Staffordshire University) or "portsmouth" (University of Portsmouth).

I have looked into Apache Stanbol and OpenNLP for Named Entity Recognition. Both will match the full names, but I cannot find a way to train them to identify variations of the names, or even lowercase versions of the full names, which are not recognised.

James

1 Answer

Gather a list of universities (which is easy to do), then scrape the list of alternative names for each university from Freebase. See also the related question: What is one way to find related names using the web?
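
Once you have the alternative names, matching them in tweets could look roughly like the sketch below. It is only an illustration under some assumptions: the `ALIASES` table is hand-written and hypothetical (in practice it would be filled from whatever you scrape), and the helper names `normalize`, `build_pattern` and `find_universities` are mine, not part of Stanbol or OpenNLP. Both the aliases and the tweet are lowercased and stripped of punctuation before matching, which is how it copes with lowercase text and hashtags.

```python
import re

# Hypothetical alias table: canonical name -> known short forms.
# In practice this would be built from the scraped list of names.
ALIASES = {
    "Aberystwyth University": ["aberystwyth university", "aber uni"],
    "Staffordshire University": ["staffordshire university", "staffs uni"],
    "University of Portsmouth": ["university of portsmouth", "portsmouth uni", "portsmouth"],
}


def normalize(text):
    """Lowercase, turn punctuation into spaces, collapse whitespace."""
    return " ".join(re.sub(r"[^a-z0-9]+", " ", text.lower()).split())


def build_pattern(aliases):
    """Return (compiled regex, alias -> canonical name lookup)."""
    lookup = {}
    for canonical, names in aliases.items():
        for name in names:
            lookup[normalize(name)] = canonical
    # Longest aliases first, so "portsmouth uni" is tried before "portsmouth".
    alternatives = sorted(lookup, key=len, reverse=True)
    pattern = re.compile(r"\b(?:" + "|".join(map(re.escape, alternatives)) + r")\b")
    return pattern, lookup


def find_universities(tweet, pattern, lookup):
    """Return canonical university names mentioned in the tweet, in order."""
    found = []
    for match in pattern.finditer(normalize(tweet)):
        canonical = lookup[match.group(0)]
        if canonical not in found:
            found.append(canonical)
    return found


pattern, lookup = build_pattern(ALIASES)
print(find_universities("Off to Aber Uni open day, then #Portsmouth!", pattern, lookup))
```

Bear in mind that a bare place name such as "portsmouth" will also match tweets about the city, so you would probably want to treat single-word aliases as weaker evidence than ones containing "uni" or "university".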

Daniel