I am working on entity extraction in R. I have a UniqueID field and a Text field, and I need to extract location information from the Text field.
The Text field holds descriptions that contain location names:
text <- c("SERANGOON JC","Blk 4","SHELL TAMPINES AVE 4","SENOKO INDUSTRIAL ESTATE","Senoko Estate","Senoko","senok Est.")
I also have a list of known locations:
Loc <- c("SERANGOON JUNIOR COLLEGE","Block 4","SHELL TAMPINES AVENUE 4","SENOKO INDUSTRIAL ESTATE")
I need to match the entries in Loc against the Text field and extract the corresponding locations. In the Text field, SENOKO INDUSTRIAL ESTATE is spelt in different ways: as a partial name (Senoko Estate or Senoko) or with a spelling mistake (senok Est.). For all of these misspelt and partial forms, I need to get the exact name from Loc, i.e. SENOKO INDUSTRIAL ESTATE.
My output would look like this (locations extracted from the Text field, with partial and misspelt names replaced by the correct name from Loc):
ID Location
123 SERANGOON JUNIOR COLLEGE|Block 4|SHELL TAMPINES AVENUE 4|SENOKO INDUSTRIAL ESTATE|SENOKO INDUSTRIAL ESTATE|SENOKO INDUSTRIAL ESTATE|SENOKO INDUSTRIAL ESTATE
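
As a possible starting point, here is a minimal sketch using approximate string distance from base R. It assumes that `adist()` with `partial = TRUE` (each text entry is compared against substrings of each candidate location) and `ignore.case = TRUE` is an acceptable similarity measure for this data. The names `d`, `matched` and `out` are only illustrative, and the ID 123 is taken from the desired output above.

```r
text <- c("SERANGOON JC", "Blk 4", "SHELL TAMPINES AVE 4",
          "SENOKO INDUSTRIAL ESTATE", "Senoko Estate", "Senoko", "senok Est.")
Loc <- c("SERANGOON JUNIOR COLLEGE", "Block 4",
         "SHELL TAMPINES AVENUE 4", "SENOKO INDUSTRIAL ESTATE")

# Distance matrix: one row per text entry, one column per candidate location.
# partial = TRUE scores each text entry against the best-matching substring
# of the location, which helps with abbreviations and partial names.
d <- adist(text, Loc, partial = TRUE, ignore.case = TRUE)

# For every text entry, pick the location with the smallest distance
# (ties go to the first candidate in Loc).
matched <- Loc[apply(d, 1, which.min)]

# Collapse the matches into one pipe-separated string for the ID.
out <- data.frame(ID = 123, Location = paste(matched, collapse = "|"))
out
```

This always returns the nearest candidate, even when nothing in Loc is really close, so with a larger list of locations a distance cutoff (for example, rejecting matches whose distance is large relative to the length of the text entry) would probably be needed.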