I am very new to NLP and am looking for topics to explore that could help me identify subjects, specifically the victim and the attacker in sentences like the following:
The UK was attacked by China over several weeks
Over several weeks, China attacked the UK.
Using spaCy, I have identified the grammatical subjects, but they change depending on their position in the sentence (active vs. passive voice):
import spacy

nlp = spacy.load("en_core_web_sm")
doc1 = nlp("China attacked the UK over several weeks")
doc2 = nlp("The UK was attacked by China over several weeks")
docs = [doc1, doc2]

for doc in docs:
    print("============")
    # For each noun chunk: its text, its root token, the root's dependency label, and the root's head
    for chunk in doc.noun_chunks:
        print(chunk.text, chunk.root.text, chunk.root.dep_,
              chunk.root.head.text)
Output:
============
China China nsubj attacked
the UK UK dobj attacked
several weeks weeks pobj over
============
The UK UK nsubjpass attacked
China China pobj by
several weeks weeks pobj over
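One possible direction, as a minimal sketch rather than a robust solution: map the dependency labels in the output above to roles. The function name `assign_roles` and the role names "attacker" and "victim" are made up for illustration, and the rule that a `pobj` headed by "by" is the attacker is an assumption that only fits passive sentences like the one above (it would misfire on something like "attacked by the river"). The rows are copied from the output so the sketch runs without spaCy.

```python
# Rows copied from the output above:
# (chunk text, root text, root dependency label, head text)
ACTIVE = [
    ("China", "China", "nsubj", "attacked"),
    ("the UK", "UK", "dobj", "attacked"),
    ("several weeks", "weeks", "pobj", "over"),
]
PASSIVE = [
    ("The UK", "UK", "nsubjpass", "attacked"),
    ("China", "China", "pobj", "by"),
    ("several weeks", "weeks", "pobj", "over"),
]


def assign_roles(rows):
    """Map dependency labels to hypothetical 'attacker' / 'victim' roles."""
    roles = {}
    for _chunk, root, dep, head in rows:
        if dep == "nsubj":
            # Active voice: the grammatical subject performs the action.
            roles["attacker"] = root
        elif dep in ("dobj", "nsubjpass"):
            # Direct object (active) or passive subject receives the action.
            roles["victim"] = root
        elif dep == "pobj" and head == "by":
            # Assumption: object of "by" is the agent of a passive verb.
            roles["attacker"] = root
    return roles


print(assign_roles(ACTIVE))
print(assign_roles(PASSIVE))
```

With these rules both sentences map China to the attacker and the UK to the victim, while "several weeks" is ignored because its head is "over".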
Any help and direction would be greatly appreciated.