I wrote some code to find the POS tags for Arabic words in Python 2.7, but the output was not correct. I found this solution on Stack Overflow: Unknown symbol in nltk pos tagging for Arabic

I downloaded all the files it needs (stanford-postagger-full-2018-02-27); this is the package used by the code in that question.

This is the code from that question, as I ran it:

    # -*- coding: cp1256 -*-

    from nltk.tag import pos_tag
    from nltk.tag.stanford import POSTagger
    from nltk.data import load
    from nltk.tokenize import word_tokenize
    _POS_TAGGER = 'taggers/maxent_treebank_pos_tagger/english.pickle'
    def pos_tag(tokens):
        tagger = load(_POS_TAGGER)
        return tagger.tag(tokens)
    path_to_model = 'D:\StanfordParser\stanford-postagger-full-2018-02-27\models/arabic.tagger'
    path_to_jar = 'D:\StanfordParser\stanford-postagger-full-2018-02-27/stanford-postagger-3.9.1.jar'

    artagger = POSTagger(path_to_model, path_to_jar, encoding='utf8')
    artagger._SEPARATOR = '/'
    tagged_sent = artagger.tag(u"أنا تسلق شجرة")
    print(tagged_sent)

The output was:

    Traceback (most recent call last):
      File "C:/Python27/Lib/mo.py", line 4, in <module>
        from nltk.tag.stanford import POSTagger
    ImportError: cannot import name POSTagger

How can I solve this error?

  • Stanford POSTagger is missing on your machine, check this question: https://stackoverflow.com/questions/13883277/stanford-parser-and-nltk – Abdulrahman Bres Mar 02 '18 at 17:58
  • Check Qutuf: http://qutuf.com Web service available at: https://qutuf.herokuapp.com Code available at: https://github.com/Qutuf/Qutuf – Muhammad Altabba Jun 16 '19 at 16:13
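Following up on the first comment, one quick thing to rule out is a wrong path to the model or the jar. Here is a minimal sketch that reports which of the required files cannot be found. The helper name `missing_files` and the placeholder paths are hypothetical; substitute the locations from your own install.

```python
import os


def missing_files(paths):
    """Return the paths that do not point at an existing regular file."""
    return [p for p in paths if not os.path.isfile(p)]


# Placeholder locations (assumptions): replace with your real model and jar paths.
required = [
    "path/to/stanford-postagger/models/arabic.tagger",
    "path/to/stanford-postagger/stanford-postagger.jar",
]

for path in missing_files(required):
    print("Not found: " + path)
```

If nothing is printed, both files exist and the problem lies elsewhere, for example in the import itself.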

1 Answer

This script runs without errors on my PC, but the tagger results do not look very good:

    import os

    # Import the class that is actually used below.
    from nltk.tag.stanford import StanfordPOSTagger

    # Tell NLTK where the local Java executable lives.
    java_path = "Put your local path in here/Java/javapath/java.exe"
    os.environ['JAVAHOME'] = java_path

    path_to_model = 'Put your local path in here/stanford-postagger-full-2017-06-09/models/arabic.tagger'
    path_to_jar = 'Put your local path in here/stanford-postagger-full-2017-06-09/stanford-postagger.jar'

    artagger = StanfordPOSTagger(path_to_model, path_to_jar, encoding='utf8')
    artagger._SEPARATOR = "/"

    # tag() expects a list of tokens, so split the sentence first.
    tagged_sent = artagger.tag("أنا أتسلق شجرة".split())
    print(tagged_sent)

The results: [('أنا', 'VBD'), ('أتسلق', 'NN'), ('شجرة', 'NN')]

Give it a try and see :-)
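As for the ImportError in the question, it looks like a naming issue rather than a missing install: as far as I know, older NLTK releases exposed the class as `POSTagger`, and NLTK 3 renamed it to `StanfordPOSTagger`. Here is a minimal sketch for checking which name your installed version provides. The helper `first_available` is my own and is not part of NLTK.

```python
import importlib


def first_available(module_name, candidate_names):
    """Return (name, object) for the first candidate found in the module.

    Returns None if the module cannot be imported or has none of the names.
    """
    try:
        module = importlib.import_module(module_name)
    except ImportError:
        return None
    for name in candidate_names:
        if hasattr(module, name):
            return name, getattr(module, name)
    return None


# Try the newer class name first, then the older one.
found = first_available("nltk.tag.stanford", ["StanfordPOSTagger", "POSTagger"])
if found is None:
    print("No Stanford POS tagger class found; is NLTK installed?")
else:
    print("Use this class name: " + found[0])
```

Whichever name it prints is the one to use in the `from nltk.tag.stanford import ...` line.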

Azz