I am building a machine learning model in Watson Knowledge Studio (WKS) and am currently annotating documents. I want the model to extract address entities. My broader goal is to detect an author's intent to switch their mailing address from an old address to a new one. The challenge is that the text will contain two or more address mentions, and the model needs to tell them apart. I have seen examples where each piece of the address is treated as a discrete entity.

For example:

  • [735] [Airport Rd], [Bismarck], [ND] [58504], with entities: street number, street name, city, state, zip

versus

  • treating the entire address as one entity: [735 Airport Rd, Bismarck, ND 58504], with entity: address

The reason I want to treat the entire address as one entity is that I need the model to distinguish between the old address and the new address. I believe that if I treat an address as one entity, I can use the relationship between the address and an identifying clause such as:

  • "new address: [new_address]" or "the new address is [new_address]"

Has anyone tried to do something similar in WKS or with another NLP tool? Is it possible to treat each piece of the address as an entity and define a relationship between each piece of the address and old_address/new_address respectively?

Brian McCann

1 Answer

You may be able to define an Address entity type and annotate multiple tokens as a single address mention. WKS does not restrict a mention to a single token, although very long mention annotations are not recommended.
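To make the idea concrete outside of WKS, here is a minimal rule-based sketch in Python of a multi-token address mention linked to an "old"/"new" cue. It does not use any WKS API. The `Mention` class, the `ADDRESS` and `ADDRESS_CUE` type names, the simplified regular expressions, and the Fargo sample address are all assumptions made up for illustration; a trained model would replace the regexes.

```python
import re
from dataclasses import dataclass


@dataclass(frozen=True)
class Mention:
    entity_type: str  # hypothetical type names: "ADDRESS" or "ADDRESS_CUE"
    start: int        # character offset where the mention begins
    end: int          # character offset just past the mention
    text: str


# Simplified pattern for addresses shaped like "735 Airport Rd, Bismarck, ND 58504".
# Real addresses vary far more than this; it is only an illustration.
ADDRESS_RE = re.compile(r"\d+ [A-Za-z .]+, [A-Za-z .]+, [A-Z]{2} \d{5}")
CUE_RE = re.compile(r"\b(old|new) address\b", re.IGNORECASE)


def find_mentions(text):
    """Return whole-address mentions and cue mentions in document order."""
    mentions = [
        Mention("ADDRESS", m.start(), m.end(), m.group())
        for m in ADDRESS_RE.finditer(text)
    ]
    mentions += [
        Mention("ADDRESS_CUE", m.start(), m.end(), m.group())
        for m in CUE_RE.finditer(text)
    ]
    return sorted(mentions, key=lambda m: m.start)


def link_cues(mentions):
    """Pair each address with the closest preceding unused cue."""
    relations = []
    last_cue = None
    for mention in mentions:
        if mention.entity_type == "ADDRESS_CUE":
            last_cue = mention
        elif last_cue is not None:
            role = last_cue.text.lower().split()[0]  # "old" or "new"
            relations.append((role, mention.text))
            last_cue = None
    return relations


sample = (
    "My old address is 12 Main St, Fargo, ND 58102. "
    "The new address is 735 Airport Rd, Bismarck, ND 58504."
)
for role, address in link_cues(find_mentions(sample)):
    print(role, "->", address)
```

Because each address is a single span, one relation per address is enough to mark it as old or new. With one entity per address piece, every piece would need its own relation to the cue.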

sgnk