I am using MaltParser with Python NLTK. I have downloaded the training data and updated NLTK to the latest version. When I call the MaltParser it raises an AssertionError. Below is the Python code, including the traceback.

 mp = MaltParser("C:/Users/mustufain/Desktop/Python Files/maltparser-1.8.1","C:/Users/mustufain/Desktop/Python Files/maltparser-1.7.2",additional_java_args=['-Xmx512m'])

Traceback (most recent call last):
  File "<pyshell#10>", line 1, in <module>
    mp = MaltParser("C:/Users/mustufain/Desktop/Python Files/maltparser-1.8.1","C:/Users/mustufain/Desktop/Python Files/maltparser-1.7.2",additional_java_args=['-Xmx512m'])
  File "C:\Python34\lib\site-packages\nltk\parse\malt.py", line 131, in __init__
    self.malt_jars = find_maltparser(parser_dirname)
  File "C:\Python34\lib\site-packages\nltk\parse\malt.py", line 72, in find_maltparser
    assert malt_dependencies.issubset(_jars)
AssertionError
>>> 
alvas
Mustufain
  • Have you setup: https://github.com/nltk/nltk/wiki/Installing-Third-Party-Software#malt-parser ? – alvas Feb 19 '16 at 15:30
  • Do you have ['log4j.jar', 'libsvm.jar', 'liblinear-1.8.jar'] in `C:/Users/mustufain/Desktop/Python Files/maltparser-1.8.1`? – alvas Feb 19 '16 at 15:32
  • What is the output of `dir C:/Users/mustufain/Desktop/Python Files/maltparser-1.8.1/` on the command prompt? – alvas Feb 19 '16 at 15:32
  • BTW, the second parameter is the trained `.mco` file not another malt-parser, see https://github.com/nltk/nltk/blob/develop/nltk/parse/malt.py#L104 – alvas Feb 19 '16 at 15:33
  • Sorry for the multiple comments. Please update your question with the answers to the questions in the comments above, and we'll try our best to help you. We need this information to help you with the question. – alvas Feb 19 '16 at 15:34
  • My hunch is that this line is causing the problem. Change the slash to a backslash and it should work: https://github.com/nltk/nltk/blob/develop/nltk/parse/malt.py#L69 – alvas Feb 19 '16 at 15:36
  • Thanks for spotting the bug! – alvas Feb 19 '16 at 15:51
  • @alvas The second parameter is the path to the mco file, not another malt-parser. Plus, I also have log4j.jar, liblinear-1.8.jar and libvsm.jar in the given path. – Mustufain Feb 20 '16 at 11:20
  • @alvas output of dir C:/Users/mustufain/Desktop/Python Files/maltparser-1.8.1/ : Invalid switch - "Users" – Mustufain Feb 20 '16 at 13:07
  • Ah ha, silly me... In windows, it's `dir C:\Users\mustufain\Desktop\Python Files\maltparser-1.8.1\` What's the output of that? – alvas Feb 20 '16 at 13:20
  • @alvas it says the system cannot find the file specified – Mustufain Feb 20 '16 at 13:34
  • @alvas I am not getting an exception now. I figured it out: I had renamed maltparser-1.8.1.jar to malt.jar, but after looking at the line of code that raised the assertion error, I renamed it back to maltparser-1.8.1.jar and it worked. – Mustufain Feb 20 '16 at 14:10
  • Please look at http://www.howtogeek.com/181774/why-windows-uses-backslashes-and-everything-else-uses-forward-slashes/ – alvas Feb 22 '16 at 14:18
  • `import nltk; from nltk.parse import malt; parser = nltk.parse.malt.MaltParser(working_dir="D:/Python Files/maltparser-1.7.2", mco="engmalt.poly-1.7.mco", additional_java_args=['-Xmx512m']); txt = "This is a test sentence"; graph = parser.raw_parse(txt)` – Mustufain Feb 22 '16 at 14:20
  • Put it in the question... -_-||| – alvas Feb 22 '16 at 14:21
  • @alvas I have updated my malt.py file and when I run the above code it gives me this exception : File "C:\Python34\lib\site-packages\nltk\parse\dependencygraph.py", line 378 "The graph doesn't contain a node " UserWarning: The graph doesn't contain a node that depends on the root element. – Mustufain Feb 22 '16 at 14:21
  • TypeError: 'list' object is not an iterator – Mustufain Feb 22 '16 at 14:22
  • Please see new answer! – alvas Feb 22 '16 at 14:25

2 Answers

TL;DR (In PYTHON3!!):

import urllib.request
import zipfile  # needed for zipfile.ZipFile below

# Download the pre-trained English model and the MaltParser distribution.
urllib.request.urlretrieve('http://www.maltparser.org/mco/english_parser/engmalt.poly-1.7.mco', 'C:\\Users\\mustufain\\Desktop\\engmalt.poly-1.7.mco')
urllib.request.urlretrieve('http://maltparser.org/dist/maltparser-1.8.1.zip', 'C:\\Users\\mustufain\\Desktop\\maltparser-1.8.1.zip')
# Unpack the distribution next to the model file.
zfile = zipfile.ZipFile('C:\\Users\\mustufain\\Desktop\\maltparser-1.8.1.zip')
zfile.extractall('C:\\Users\\mustufain\\Desktop\\maltparser-1.8.1\\')

Then:

from nltk.parse import malt
mp = malt.MaltParser('C:\\Users\\mustufain\\Desktop\\maltparser-1.8.1\\', "C:\\Users\\mustufain\\Desktop\\engmalt.poly-1.7.mco")
mp.parse_one('I shot an elephant in my pajamas .'.split()).tree()
L3viathan
alvas

If the downloads and the environment variable setup are all correct, the most probable cause is how file and directory paths are split in nltk/parse/malt.py, at https://github.com/nltk/nltk/blob/develop/nltk/parse/malt.py#L69, which separates the directory from the filename in a Linux-specific way:

def find_maltparser(parser_dirname):
    """
    A module to find MaltParser .jar file and its dependencies.
    """
    if os.path.exists(parser_dirname): # If a full path is given.
        _malt_dir = parser_dirname
    else: # Try to find path to maltparser directory in environment variables.
        _malt_dir = find_dir(parser_dirname, env_vars=('MALT_PARSER',))
    # Checks that that the found directory contains all the necessary .jar
    malt_dependencies = ['','','']
    _malt_jars = set(find_jars_within_path(_malt_dir))
    _jars = set(jar.rpartition('/')[2] for jar in _malt_jars)
    malt_dependencies = set(['log4j.jar', 'libsvm.jar', 'liblinear-1.8.jar'])

    assert malt_dependencies.issubset(_jars)
    assert any(filter(lambda i: i.startswith('maltparser-') and i.endswith('.jar'), _jars))
    return list(_malt_jars)

The bug has been fixed, and the fix is in the process of being merged at https://github.com/nltk/nltk/pull/1292

Changing this line:

_jars = set(jar.rpartition('/')[2] for jar in _malt_jars)

to this should solve your problem =)

_jars = set(os.path.split(jar)[1] for jar in _malt_jars)
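To see why the original line fails on Windows, here is a minimal sketch comparing the two expressions on a backslash-separated path. The jar path is hypothetical, and `ntpath` (the Windows flavour of `os.path`) is used only so the comparison behaves the same on any platform; on Windows itself `os.path.split` does the same job.

```python
import ntpath  # Windows flavour of os.path, importable on any platform

# Hypothetical jar path, shaped like one found under a Windows directory.
jar = r"C:\malt\maltparser-1.8.1\lib\log4j.jar"

def old_name(path):
    # Original logic: the path contains no '/', so rpartition returns
    # the whole path unchanged instead of the bare file name.
    return path.rpartition('/')[2]

def new_name(path):
    # Patched logic: split on the platform's separator.
    return ntpath.split(path)[1]

print(old_name(jar))
print(new_name(jar))
```

With the original expression, none of the names in `_jars` can equal `'log4j.jar'`, so `malt_dependencies.issubset(_jars)` is false and the first assertion fails.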

For issues unrelated to the code itself, such as how you have set up the environment variables or where you downloaded and saved the MaltParser directories and files, see https://github.com/nltk/nltk/issues/1294

alvas
  • It did not work. I changed the line in malt.py and restarted, but it still gives me an assertion error when I load the malt-parser. – Mustufain Feb 20 '16 at 11:22
  • It gives me this new assertion error after changing the line in malt.py: assert any(filter(lambda i: i.startswith('maltparser-') and i.endswith('.jar'), _jars)) AssertionError – Mustufain Feb 20 '16 at 11:27
  • It is throwing an exception at this line: malt_dependencies = set(['log4j.jar', 'libsvm.jar', 'liblinear-1.8.jar']) – Mustufain Feb 20 '16 at 12:00
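
The second assertion quoted in these comments fails when no jar named `maltparser-*.jar` is found, which matches the resolution reported under the question (the main jar had been renamed to malt.jar). Below is a rough diagnostic sketch that applies the same two checks to a directory before handing it to `MaltParser`. The function name `check_malt_dir` is hypothetical and not part of NLTK, and the directory walk is an assumption about how the jars are gathered.

```python
import os

# Dependency names taken from the find_maltparser source quoted above.
REQUIRED_JARS = {'log4j.jar', 'libsvm.jar', 'liblinear-1.8.jar'}

def check_malt_dir(dirname):
    """Return (missing_dependencies, has_maltparser_jar) for dirname.

    Hypothetical helper: it mirrors the two assertions in
    find_maltparser but reports what is wrong instead of raising.
    """
    names = set()
    # Collect bare .jar file names from the directory and its subfolders.
    for _root, _dirs, files in os.walk(dirname):
        names.update(f for f in files if f.endswith('.jar'))
    missing = REQUIRED_JARS - names
    has_main = any(n.startswith('maltparser-') for n in names)
    return missing, has_main
```

Calling `check_malt_dir` on the MaltParser directory should return an empty set and `True` when the layout is what the assertions expect; a non-empty set points at the first assertion, and `False` points at the second.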