# NLTK's wrapper around the Stanford parser; `path` is the directory that
# holds the Stanford parser distribution.
import time
from nltk.parse.stanford import StanfordDependencyParser

path_to_jar = path + 'stanford-parser-full-2015-12-09/stanford-parser.jar'
path_to_models_jar = path + 'stanford-parser-full-2015-12-09/stanford-parser-3.6.0-models.jar'
sentence = 'This is a nice phone 4 me'

print 'Loading module'
start = time.time()
dependency_parser = StanfordDependencyParser(path_to_jar=path_to_jar,
                                             path_to_models_jar=path_to_models_jar)
print time.time() - start

start = time.time()
result = dependency_parser.raw_parse(sentence)
print result
print time.time() - start

I've been working on dependency parsing using the Stanford parser through NLTK. The problem I face is the execution time. Here is the output of the above code:

Loading module
0.0047550201416
<listiterator object at 0x10bbcbb10>
3.65611600876

It takes approximately 4 seconds per sentence/text. In Java, where I load the model once into a static variable, it's super fast. Any suggestions? At this rate it will take me 100 hours to train, provided no error occurs!
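
As an aside, the `<listiterator object ...>` line in the output shows that `raw_parse` returns a lazy iterator rather than the parse itself; you have to iterate it to get the `DependencyGraph` objects. A minimal sketch, assuming NLTK's standard `DependencyGraph` API:

result = dependency_parser.raw_parse(sentence)
for dep_graph in result:               # one DependencyGraph per parse
    print dep_graph.to_conll(4)        # word, POS tag, head index, relation
    print list(dep_graph.triples())    # ((head, tag), rel, (dependent, tag)) triples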

  • You're conflating parser load time with parsing time. Try using `raw_parse_sents` when parsing multiple sentences (see the sketch after these comments). – alvas Oct 11 '16 at 03:52
  • That helped, thank you! Have you come across AssertionErrors during execution? One of my datasets got caught, and none of the solutions, like decoding signed UTF-8, works. – Nachiappan Chockalingam Oct 17 '16 at 09:26
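
A minimal sketch of the batching suggestion above: each `raw_parse` call launches a fresh JVM and reloads the model (which is where the ~4 seconds go, not the Python-side constructor), so `raw_parse_sents` amortises that cost over a whole batch. The sentence list here is illustrative:

sentences = ['This is a nice phone 4 me',
             'The battery is great',
             'Delivery was slow though']

start = time.time()
parses = dependency_parser.raw_parse_sents(sentences)  # one JVM launch for the whole batch
for sentence_parses in parses:                          # iterator of parses per sentence
    for dep_graph in sentence_parses:
        print dep_graph.to_conll(4)
print time.time() - start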
