I am trying to count how many times each word appears in the whole corpus,
but I am getting an error:

    corpus_root = os.path.abspath('../nlp_urdu/out1_data')
    mycorpus = nltk.corpus.reader.TaggedCorpusReader(corpus_root, '.*')
    noun = []
    count_freq = defaultdict(int)
    for infile in mycorpus.fileids():
        print(infile)
    for i in mycorpus.tagged_sents():
        texts = [word for word, pos in i if pos == 'NN']
        noun.append(texts)
        count_freq[noun]+= 1
        print(count_freq)

The error I am getting is:

count_freq[noun]+= 1

TypeError: unhashable type: 'list'

user3778289

1 Answer

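texts is a list of nouns, and noun is a list of those lists. A list is unhashable, so it cannot be used as a dictionary key, which is why count_freq[noun] raises the TypeError.
Each key of count_freq must be a single noun (a string), so loop over texts and increment the count for each noun: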

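    import os
    from collections import defaultdict

    import nltk

    corpus_root = os.path.abspath('../nlp_urdu/out1_data')
    mycorpus = nltk.corpus.reader.TaggedCorpusReader(corpus_root, '.*')
    count_freq = defaultdict(int)
    for infile in mycorpus.fileids():
        print(infile)
    for i in mycorpus.tagged_sents():
        # Keep only the words tagged as nouns in this sentence.
        texts = [word for word, pos in i if pos == 'NN']
        for noun in texts:
            # Each key is one noun (a string), which is hashable.
            count_freq[noun] += 1

    # Print once, after every sentence has been counted.
    print(count_freq)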
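
The same idea can also be written with collections.Counter from the standard library. This is only a sketch that runs without the corpus: tagged_sents below is a hypothetical stand-in for mycorpus.tagged_sents(), with made-up words and tags, and count_nouns is a helper name introduced here for illustration.

```python
from collections import Counter

# Hypothetical stand-in for mycorpus.tagged_sents(): each sentence is a
# list of (word, tag) pairs. The words and tags are made up for illustration.
tagged_sents = [
    [("kitab", "NN"), ("achi", "JJ"), ("hai", "VB")],
    [("kitab", "NN"), ("aur", "CC"), ("qalam", "NN")],
]


def count_nouns(sents, tag="NN"):
    """Count how often each word carrying `tag` occurs across all sentences."""
    counts = Counter()
    for sent in sents:
        # Every key added here is a single string, which is hashable,
        # unlike the list that caused the TypeError.
        counts.update(word for word, pos in sent if pos == tag)
    return counts


noun_freq = count_nouns(tagged_sents)
print(noun_freq.most_common())
```

With the real corpus, passing mycorpus.tagged_sents() in place of the sample list should give the counts over all files, assuming the reader picks up every file under corpus_root.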
Indent
  • Actually, I only took the words that are nouns, which are in "texts", and then appended all the nouns from the corpus to a list named noun. – user3778289 Oct 26 '17 at 18:20
  • This is not showing the correct output. There are 13000 nouns, but it is just repeating the one file. – user3778289 Oct 26 '17 at 18:25