Here's an excerpt from a (supposedly) funny restaurant review:
I'd like to personally shake Mr Tofu's hand. While I cannot medically prove it, I am 100% certain that their soondubu contains undefined healing properties. Some how some way, I always feel better after a meal here. Got a cold? Screw the Nyquil and get the spicy kimchi soondubu.
I'd like to extract the important entities and link them to Wikipedia entities. I've trained spaCy on a small sample of Wikipedia/Wikidata and run entity linking on the review:
[('Tofu', 'PERSON', 'Q177378'),
('Nyquil', 'WORK_OF_ART', 'NIL')]
I'd like other entities to be extracted and linked as well, e.g.:
kimchi -> Kimchi
cold -> Common cold
healing -> medicine
medically -> medicine
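To make the desired output concrete, here is a minimal sketch in plain Python (not spaCy) of the kind of lookup I have in mind. The `GAZETTEER` table and the `link_terms` function are hypothetical names used only for illustration, and the table contains only the four mappings listed above:

```python
import re

# Hypothetical lookup table: lowercase surface form -> Wikipedia title.
# It holds only the four mappings listed above.
GAZETTEER = {
    "kimchi": "Kimchi",
    "cold": "Common cold",
    "healing": "medicine",
    "medically": "medicine",
}


def link_terms(text, gazetteer=GAZETTEER):
    """Return (surface form, title) pairs in order of first appearance."""
    seen = set()
    links = []
    for match in re.finditer(r"[A-Za-z]+", text):
        word = match.group(0)
        key = word.lower()
        # Report each known term once, keeping the casing found in the text.
        if key in gazetteer and key not in seen:
            seen.add(key)
            links.append((word, gazetteer[key]))
    return links


review = (
    "While I cannot medically prove it, I am 100% certain that their "
    "soondubu contains undefined healing properties. Got a cold? "
    "Screw the Nyquil and get the spicy kimchi soondubu."
)
print(link_terms(review))
```

A hand-written table like this obviously does not scale or disambiguate (for example, "cold" as a temperature), which is why I would like the entity linker to do this instead.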
It looks like spaCy can link only named entities. I've tried explicitly registering the other terms as named entities (which obviously does not scale well):
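from spacy.pipeline import EntityRuler

# nlp is the pipeline that already contains the trained entity linker
ruler = EntityRuler(nlp)
patterns = [{"label": "ORG", "pattern": "kimchi"}, {"label": "ORG", "pattern": "cold"}]
ruler.add_patterns(patterns)
# with no position argument, add_pipe appends the ruler at the end of the pipeline
nlp.add_pipe(ruler)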
However, spaCy does not seem to link the new entities at all (their KB IDs come back empty):
[ ('Tofu', 'PERSON', 'Q177378'),
('cold', 'ORG', ''),
('Nyquil', 'WORK_OF_ART', 'NIL'),
('kimchi', 'ORG', '')]
- How can I make spaCy recognize other entities as well?
- Should this be done before training the entity linking model, or can it be done with an already trained model?
- Is spaCy the right tool for my task at all?