15

Does PyPy work with NLTK, and if so, is there an appreciable performance improvement, say for the Bayesian classifier?

While we're at it, do any of the other Python environments (Shed Skin, etc.) offer better NLTK performance than CPython?

Parand
  • I don't know about PyPy (and from reading the PyPy FAQ I suspect it would take some work to get it running), but I've wondered roughly the same thing about IronPython. – winwaed Dec 10 '10 at 14:36
  • You should remove the answer to your question and insert it as an answer. – Tim McNamara Jan 31 '11 at 09:03

3 Answers

5

At least some of NLTK does work with PyPy and there is some performance gain, according to someone on #pypy on freenode. Have you run any tests? Just download PyPy from pypy.org/download.html and instead of "time python yourscript.py data.txt" type "time pypy yourscript.py data.txt".
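If you want something a bit more repeatable than eyeballing two "time" runs, here is a minimal sketch of a comparison harness. It assumes executables named "python" and "pypy" are on your PATH (adjust the names for your install), that the harness itself runs under a current CPython 3, and that yourscript.py and data.txt are the placeholders from above. The helper names time_command and compare are made up for this example.

```python
import shutil
import subprocess
import sys
import time


def time_command(cmd, repeats=3):
    """Return the best wall-clock time in seconds over `repeats` runs of cmd."""
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        # Discard the script's stdout so printing does not dominate the timing.
        subprocess.run(cmd, check=True, stdout=subprocess.DEVNULL)
        best = min(best, time.perf_counter() - start)
    return best


def compare(script_args, interpreters=("python", "pypy")):
    """Time the same script under each interpreter that is found on PATH."""
    results = {}
    for name in interpreters:
        exe = shutil.which(name)
        if exe is None:
            continue  # interpreter not installed; skip it
        results[name] = time_command([exe] + list(script_args))
    return results


if __name__ == "__main__" and len(sys.argv) > 1:
    # Assumed invocation: python compare.py yourscript.py data.txt
    for name, seconds in compare(sys.argv[1:]).items():
        print("%s: %.2fs" % (name, seconds))
```

The numbers include interpreter startup and, for PyPy, JIT warm-up, so a very short script may not show much of a difference; a longer run such as training a classifier on a real corpus is likely a fairer test.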

TryPyPy
  • I did try NLTK with PyPy 1.4.0. I ran into a couple of Python library path and import issues, fixed those, ran into more, and gave up before getting it to work. I'll likely give it another try, but wanted to see if anyone had already had success. – Parand Dec 30 '10 at 09:33
  • 1
    Comment I received via email: The main issues are: 1. PyPy implements Python 2.5. This means adding "from __future__ import with_statement" here and there, rewriting usages of property.setter, and fixing up library calls that are new in 2.6, like os.walk. 2. NLTK needs PyYAML. Simply symlinking (or copying) it into pypy-1.4/site-packages works. – Parand Dec 30 '10 at 19:21
  • 2
    2.7 support is on its way and will probably land in January, but you can get the current state of 2.7 support (and then translate/compile it yourself) with: "hg clone https://bitbucket.org/pypy/pypy && hg up fast-forward". – TryPyPy Dec 30 '10 at 20:15
4

I got a response via email (Seo, please feel free to respond here) that said:

The main issues are:

1. PyPy implements Python 2.5. This means adding "from __future__ import with_statement" here and there, rewriting usages of property.setter, and fixing up library calls that are new in 2.6, like os.walk.

2. NLTK needs PyYAML. Simply symlinking (or copying) it into pypy-1.4/site-packages works.

And:

Do you have NLTK running with PyPy, and if so are you seeing performance improvements?

Yes, and yes.

So apparently NLTK does run with PyPy and there are performance improvements.
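For anyone wondering what "rewriting usages of property.setter" looks like in practice, here is a rough sketch. The Token class is hypothetical and not taken from NLTK; it only illustrates the pattern. The @x.setter decorator form was added in Python 2.6, so on a 2.5-level interpreter the getter and setter have to be passed to property() directly. The with-statement fix is simpler: on 2.5, the line "from __future__ import with_statement" goes at the top of each module that uses "with".

```python
class Token(object):
    """Hypothetical example class, not part of NLTK."""

    def __init__(self, text):
        self._text = text

    # Python 2.6+ style, unavailable on a 2.5-level interpreter:
    #
    #     @property
    #     def text(self): ...
    #
    #     @text.setter
    #     def text(self, value): ...

    # 2.5-compatible equivalent: build the property from plain functions.
    def _get_text(self):
        return self._text

    def _set_text(self, value):
        # Normalise on assignment, just to show the setter really runs.
        self._text = value.strip()

    text = property(_get_text, _set_text)


token = Token("hello")
token.text = "  world  "
print(token.text)  # world
```

The property() form behaves the same on newer interpreters, so a rewrite like this should not need to be reverted once 2.6+ support arrives.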

Parand
3

You can run NLTK with PyPy now. There's a benchmark under PyPy 1.8, although later releases (currently PyPy 2.0 is the latest) should perform better still. NLTK runs its unit tests under PyPy these days, so the NLTK developers are ensuring it works.

Wilfred Hughes