NLTK: How to get specific contents of an array in a loop with Python?

Question

Is it possible to do something like the following in Python?

import nltk
from nltk.corpus.reader import TaggedCorpusReader
reader = TaggedCorpusReader('cookbook', r'.*\.pos')
train_sents=reader.tagged_sents()
tags=[]
count=0
for sent in train_sents:
    for (word,tag) in sent:
        # if the tag is DTDEF, I want to get the tag after it
        if tag=="DTDEF":
            tags[count]=tag[actualIndex+1]
            count+=1


fd = nltk.FreqDist(tags)
fd.tabulate()

Thank you in advance for your answer and advice.

You should shorten your question title; it's really hard to read and to understand what you are asking — Higanbana, Jun 14 '19 at 06:15

score 1 · Accepted Answer · answered Jun 14 '19 at 06:23

I'm not 100% sure I understand, but if you're looking to get all the entries in a list after a specific entry, the easiest way would be to do:

foundthing = False
result = []
for i in items:  # 'items' is the list you are searching (avoid naming it 'list')
    if foundthing:
        result.append(i)
    if i == "Thing I'm Looking For":
        foundthing = True

Adding this to your code results in:

import nltk
from nltk.corpus.reader import TaggedCorpusReader

reader = TaggedCorpusReader('cookbook', r'.*\.pos')
train_sents = reader.tagged_sents()
tags = []
foundit = False
for sent in train_sents:
    # nltk.bigrams yields pairs of adjacent (word, tag) tuples, so 'tag' here
    # is the second tuple of the pair and tag[1] is its tag.
    for (word, tag) in nltk.bigrams(sent):
        if foundit:  # If the entry is after 'DTDEF'
            tags.append(tag[1])  # Add its tag to the resulting list of tags.
        if tag[1] == 'DTDEF':  # If the entry is 'DTDEF'
            foundit = True  # Set the 'After DTDEF' flag.

fd = nltk.FreqDist(tags)
fd.tabulate()
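
To see why the loop indexes into tag, it may help to look at what iterating over adjacent pairs produces. The following is only a sketch on made-up data: the sample words and the tags other than DTDEF are hypothetical, and zip(sent, sent[1:]) is used as a plain-Python stand-in that yields the same adjacent pairs as nltk.bigrams(sent) for a list.

```python
# One tagged sentence in the shape returned by tagged_sents():
# a list of (word, tag) tuples. The contents are made up for illustration.
sent = [("the", "DTDEF"), ("cat", "NN"), ("sat", "VBD")]

# Pair each token with the token that follows it.
pairs = list(zip(sent, sent[1:]))

for first, second in pairs:
    # Both 'first' and 'second' are (word, tag) tuples,
    # so the tag of the following token is second[1].
    print(first, second, "-> tag of the second token:", second[1])
```

Each element of the pair is itself a (word, tag) tuple, which is why unpacking a bigram into two names gives two tuples rather than a word and a tag.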

Hope this helps.

Hi, thank you for your answer. The algorithm is right. — Nambi, Jun 14 '19 at 06:45

Nambi · Answer 2 · 2019-06-14T07:20:01.060

Thanks to @CrazySqueak for the help. I used his code and edited some parts to get this:

import nltk
from nltk.corpus.reader import TaggedCorpusReader

reader = TaggedCorpusReader('cookbook', r'.*\.pos')
train_sents = reader.tagged_sents()
tags = []
foundit = False
for sent in train_sents:
    # I changed this line: iterating over bigrams means 'word' and 'tag' are
    # each a (word, tag) tuple, for two adjacent tokens.
    for (word, tag) in nltk.bigrams(sent):
        if foundit:  # The second token of this bigram comes right after 'DTDEF'
            # Store tag[1] instead of tag; tag alone would store the word
            # as well as the tag.
            tags.append(tag[1])
            # Reset the flag, otherwise later tags would be stored too,
            # even when they do not follow 'DTDEF'.
            foundit = False
        if tag[1] == 'DTDEF':  # The second token of this bigram is 'DTDEF'
            foundit = True  # Set the 'After DTDEF' flag.

fd = nltk.FreqDist(tags)
fd.tabulate()
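
A more compact variant is possible without the flag, by testing the first token of each pair and storing the tag of the second. This is only a sketch under a few assumptions: tags_following and sample_sents are hypothetical names, the sample data is made up, zip(sent, sent[1:]) stands in for nltk.bigrams(sent), and collections.Counter stands in for nltk.FreqDist so the example runs without a corpus.

```python
from collections import Counter


def tags_following(tagged_sents, target="DTDEF"):
    """Collect the tag of every token that directly follows a `target`-tagged token."""
    following = []
    for sent in tagged_sents:
        # Pair each (word, tag) tuple with the one right after it.
        for (_, prev_tag), (_, next_tag) in zip(sent, sent[1:]):
            if prev_tag == target:
                following.append(next_tag)
    return following


# Made-up sentences in the same shape as reader.tagged_sents().
sample_sents = [
    [("the", "DTDEF"), ("cat", "NN"), ("saw", "VBD"), ("the", "DTDEF")],
    [("a", "DT"), ("dog", "NN"), ("the", "DTDEF"), ("big", "JJ")],
]

following = tags_following(sample_sents)
counts = Counter(following)
print(counts.most_common())
```

Because the pairs are built per sentence, a DTDEF at the very start of a sentence is still counted, and a DTDEF at the very end of one sentence does not pick up the first tag of the next sentence. With the real corpus, passing reader.tagged_sents() in place of sample_sents and nltk.FreqDist(following) in place of Counter should give the table from fd.tabulate().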

Thank you again for your advice and your answer.
