python3 -m nltk.downloader -d /usr/local/share/nltk_data all

Running the above command on GCP produces the following RuntimeWarning:

'nltk.downloader' found in sys.modules after import of package 'nltk', but prior to execution of 'nltk.downloader'; this may result in unpredictable behaviour

NLTK itself is already installed. While installing it, I had to add the --proxy switch, so I assume a proxy is needed here too, and its absence might be the problem. However, I don't know how to add a proxy switch to this command.

Tony Stark

1 Answer

I created a GCE instance, installed NLTK, and ran the command you included; it showed the same warning message. According to this issue in the NLTK GitHub repository, the warning appears because of how the modules are imported in the package's __init__.py file. The NLTK developer explains that refactoring the modules would break backwards compatibility. Going by the note here, the warning is a known issue and appears to be harmless.

If you would like to hide the warning message, you can use Python's -W switch like so:

python3 -W ignore -m nltk.downloader -d ./nltk.data all

Another option is to import NLTK first in a Python one-liner using the -c switch (the argument to nltk.download is the package to download; the available packages are listed in the NLTK documentation):

python3 -c "import nltk; nltk.download('all')"

RUN python3 -c "import nltk; nltk.download('all')" # When running from a Dockerfile

I tested both options on a GCE instance and did not see the warning message. The linked documentation is external to GCP, so I cannot vouch for its accuracy.
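For the proxy part of the question, here is a minimal sketch that I have not tested behind a proxy. It assumes that nltk.set_proxy is enough to route the downloader through your proxy. The function name download_nltk_data and the proxy URL shown in the usage comment are hypothetical placeholders, not anything taken from your setup.

```python
def download_nltk_data(packages="all", download_dir=None, proxy=None):
    """Download NLTK data through nltk.download instead of `-m nltk.downloader`.

    `proxy` is an optional proxy URL (hypothetical parameter name); when it is
    None, no proxy is configured and NLTK uses its default behaviour.
    """
    # Importing the package here and calling nltk.download avoids executing
    # nltk.downloader as a script, which is what triggers the RuntimeWarning.
    import nltk

    if proxy:
        # Registers the proxy for the downloads that follow.
        nltk.set_proxy(proxy)

    # Returns True when the download succeeds, False otherwise.
    return nltk.download(packages, download_dir=download_dir)


# Example call (placeholder proxy URL, replace with your own):
# download_nltk_data("all", "/usr/local/share/nltk_data", "http://proxy.example.com:3128")
```

If your proxy requires credentials, nltk.set_proxy also accepts a user and a password argument; check the NLTK data installation page for the exact form.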

ErnestoC