I am using spaCy to match text against certain dependency patterns. The problem I'm facing is that the DependencyParser gives different results, even in simple sentences, when a single word (with the same ground-truth POS tag) is changed.

For example, in 'The baker and supervisor support the baking', 'support' is tagged as VERB and 'baker' and 'supervisor' as NOUN; 'baker' and 'supervisor' have an nsubj dependency on 'support', and 'baking' is the dobj of 'support'. See here. Changing the sentence to 'The baker and oven support the baking' gives 'oven' the POS tag ADV instead of NOUN, with an advmod dependency on 'support'. See here. This makes no sense at all, as 'oven' is never an adverb. I suspected that the DependencyParser uses the POS tags and that changing them could therefore change the resulting dependencies.
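For reference, this is roughly how I inspect the two sentences (en_core_web_sm is just a placeholder here; the pipeline I actually load may differ):

```python
import spacy

nlp = spacy.load("en_core_web_sm")  # placeholder model name

for text in (
    "The baker and supervisor support the baking",
    "The baker and oven support the baking",
):
    doc = nlp(text)
    print(text)
    for token in doc:
        # POS tag, dependency label and syntactic head of each token
        print(f"  {token.text:<10} {token.pos_:<6} {token.dep_:<8} -> {token.head.text}")
```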
I found this question [3] and managed to extract the probabilities of all POS tags for each token with Tagger.model.predict([doc]), which delivers a matrix of shape len(doc) x len(tagger.labels). In the first sentence, 'supervisor' got 99.8% for NOUN, while 'oven' only got 62% for ADV. So I produced multiple docs for the same text, in which I changed Token.pos_ to the second and third most probable candidates whenever there was uncertainty (most probable tag below 90%). I then ran the DependencyMatcher on all three docs, expecting the different POS tags to lead to different dependencies, but the dependencies do not change. Even when 'oven' has the POS tag NOUN (its third most probable tag), it still has an advmod dependency on 'support', which doesn't make sense.
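Roughly, the extraction and re-matching look like this (the DependencyMatcher pattern below is only a placeholder, not my real pattern, and depending on the pipeline the tagger may score fine-grained tags rather than the coarse Token.pos_ values):

```python
import numpy
from spacy.matcher import DependencyMatcher

doc = nlp("The baker and oven support the baking")

# The tagger's model takes a batch of docs and returns one score matrix
# per doc, of shape (len(doc), len(tagger.labels)).
tagger = nlp.get_pipe("tagger")
scores = tagger.model.predict([doc])[0]

for token, row in zip(doc, scores):
    top3 = numpy.argsort(row)[::-1][:3]  # indices of the three best labels
    candidates = [(tagger.labels[i], float(row[i])) for i in top3]
    print(token.text, candidates)

# Token.pos_ is writable, so an alternative candidate can be forced by hand
# (index 3 is 'oven' in this tokenization).
doc[3].pos_ = "NOUN"

# Placeholder pattern: a verb with an nsubj child.
matcher = DependencyMatcher(nlp.vocab)
pattern = [
    {"RIGHT_ID": "verb", "RIGHT_ATTRS": {"POS": "VERB"}},
    {"LEFT_ID": "verb", "REL_OP": ">", "RIGHT_ID": "subject",
     "RIGHT_ATTRS": {"DEP": "nsubj"}},
]
matcher.add("VERB_WITH_SUBJECT", [pattern])
print(matcher(doc))  # the dependencies on the doc are the same as before
```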
So, similar to [3], I want to inspect the probabilities of all possible values of Token.dep_ for each token in the doc. Unfortunately, DependencyParser.model.predict([doc]) doesn't deliver this (as far as I know).
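A minimal sketch of that attempt (the exact object returned seems to depend on the spaCy version, but it is not a per-token, per-label score matrix):

```python
# nlp is assumed to be set up as in the earlier snippets.
parser = nlp.get_pipe("parser")
doc = nlp("The baker and oven support the baking")

output = parser.model.predict([doc])
# Whatever comes back here is an internal parsing structure, not a
# len(doc) x len(parser.labels) probability matrix.
print(type(output))
```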