How to get indices of words in a spaCy dependency parse?

Question

I am trying to use spaCy to extract word relations/dependencies, but I am a little unsure how to use the information it gives me. I understand how to generate the visual dependency tree for debugging.

Specifically, I don’t see a way to map each entry in a token's list of children back to a specific token in the sentence. There is no index, just a list of words.

Looking at the example here: https://spacy.io/usage/linguistic-features#dependency-parse

nlp("Autonomous cars shift insurance liability toward manufacturers")

Also, if the sentence were nlp("Autonomous cars shift insurance liability toward manufacturers of cars"), how would I disambiguate between the two instances of cars?

The only thing I can think of is that maybe these tokens are actually reference types that I can map to indices myself. Is that the case?

Basically, I am looking to start with getting the predicates and args to understand “who did what to whom and how/using what”.

score 2 · Accepted Answer · answered Jun 25 '20 at 21:54

Yeah, when you print a token it looks like a string. It’s not. It’s an object with tons of metadata, including token.i which is the index you are looking for.
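
To make that concrete, here is a minimal sketch, assuming spaCy v3. So that it runs without downloading a model, the Doc is built by hand: the `heads` and `deps` lists are my own illustrative annotation, not real parser output, and `children_by_index` is a hypothetical helper name. With a trained pipeline you would get `doc` from `nlp(...)` instead, and the rest stays the same.

```python
from spacy.tokens import Doc
from spacy.vocab import Vocab

words = ["Autonomous", "cars", "shift", "insurance", "liability",
         "toward", "manufacturers", "of", "cars"]
# Hand-written annotation (an assumption for illustration, not parser output).
# heads[i] is the absolute index of token i's head; the root points at itself.
heads = [1, 2, 2, 4, 2, 2, 5, 6, 7]
deps = ["amod", "nsubj", "ROOT", "compound", "dobj",
        "prep", "pobj", "prep", "pobj"]

doc = Doc(Vocab(), words=words, heads=heads, deps=deps)


def children_by_index(doc):
    """Map each token's index to the indices of its syntactic children."""
    return {token.i: [child.i for child in token.children] for token in doc}


# token.i is the token's position in the Doc; head and children are Token
# objects too, so they carry their own .i.
for token in doc:
    print(token.i, token.text, token.dep_, token.head.i,
          [child.i for child in token.children])

# The two "cars" tokens share the same text but have different indices.
cars_indices = [token.i for token in doc if token.text == "cars"]
print(cars_indices)
```

Because every child is itself a Token, `child.i` is enough to tell the two "cars" apart: they are distinct objects at distinct positions in the same Doc.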

If you’re just getting started with spaCy, the best use of your time is the course. It’s quick and practical.

Sam H.

Ah, I figured the index must exist, but somehow I couldn't find the relevant part of the API reference. This is a bit tangential, but is spaCy actually a really good tool in comparison with Stanford's CoreNLP and the more recent Stanza, or even Google's NLP? I'm trying to get information about relations between objects/entities, what they do to what, causal and conditional relationships, etc. Specifically, I'd like to be able to get information about sentences like "When x happens, Y happens" or "When I was a boy, I played with blocks" and extract the logic. Is this doable with spaCy? – synchronizer Jun 26 '20 at 00:17
@synchronizer did this answer the question? You have enough points that I assume you know it’s kinda rude to not accept or upvote a correct answer. To your added questions: spaCy is pretty great. I wouldn’t touch CoreNLP, the UX is too rough. Stanza is really powerful, but trades convenience for power. I would start with spaCy, then move to Stanza if I needed those extra few percent of accuracy. There is a library that makes Stanza models available in spaCy though, which is how I’d use Stanza. GCP entity extraction is very good, but I haven’t tried configuring it. – Sam H. Jun 26 '20 at 15:19
Whoops, sorry about that. I upvoted/checked. I was hoping to figure out a specific way to traverse the dependency tree though. The site recommends going in order of the words, but I’m not sure where that would get me necessarily. Maybe this should be a separate question. I’d like to extract predicates and so on for cause and effect relations (If x happens, when x happens), but maybe this could be a set of hard-coded rules. I’m trying to find more examples. – synchronizer Jun 26 '20 at 16:38
So you are hoping to extract clauses from "if...then..." type statements, even if they aren't framed as "if...then...", e.g. "Autonomous cars shift insurance liability toward manufacturers" <--> "If there are autonomous cars, then insurance liability shifts towards manufacturers"... and you want to extract ("there are autonomous cars", "insurance liability shifts towards manufacturers")? – Sam H. Jun 26 '20 at 17:55
Or maybe you just meant you want to extract the agent, action, and patient? e.g. "Autonomous cars shift insurance liability toward manufacturers" -> ("autonomous cars", "shift", "liability") or ("autonomous cars", "shift", "liability towards manufacturers")? Regardless, I feel like that would be a separate question. SO works better when questions are focused, and this question is "How to get indices of words in a Spacy dependency parse?" – Sam H. Jun 26 '20 at 18:01
Right, the second one is correct. I can just ask that question the way you worded it right now. I'll go through the tutorial for sure though. – synchronizer Jun 26 '20 at 19:23
Thank you for re-articulating the follow-up question. Here is the post, in case you also know how I might approach these problems: https://stackoverflow.com/questions/62601716/in-spacy-nlp-how-extract-the-agent-action-and-patient-as-well-as-cause-eff – synchronizer Jun 26 '20 at 19:36