5

In the IPython console I typed `from nltk.book import *` and got several LookupErrors. The output I got is shown below.

*** Introductory Examples for the NLTK Book ***
Loading text1, ..., text9 and sent1, ..., sent9
Type the name of the text or sentence to view it.
Type: 'texts()' or 'sents()' to list the materials.
---------------------------------------------------------------------------
LookupError Traceback (most recent call last)
<ipython-input-3-8446809acbd4> in <module>()
 ----> 1 from nltk.book import*

C:\Users\dell\Anaconda\lib\site-packages\nltk-3.0.3-py2.7.egg\nltk\book.py in <module>()
 20 print("Type: 'texts()' or 'sents()' to list the materials.")
 21 
---> 22 text1 = Text(gutenberg.words('melville-moby_dick.txt'))
 23 print("text1:", text1.name)
 24 

 C:\Users\dell\Anaconda\lib\site-packages\nltk-3.0.3-py2.7.egg\nltk\corpus\util.pyc in __getattr__(self, attr)
 97             raise AttributeError("LazyCorpusLoader object has no attribute '__bases__'")
 98 
 ---> 99         self.__load()
100         # This looks circular, but its not, since __load() changes our
101         # __class__ to something new:

 C:\Users\dell\Anaconda\lib\site-packages\nltk-3.0.3-py2.7.egg\nltk\corpus\util.pyc in __load(self)
 62             except LookupError as e:
 63                 try: root = nltk.data.find('corpora/%s' % zip_name)
 ---> 64                 except LookupError: raise e
 65 
 66         # Load the corpus.

 LookupError: 
 **********************************************************************
 Resource u'corpora/gutenberg' not found.  Please use the NLTK
 Downloader to obtain the resource:  >>> nltk.download()
 Searched in:
- 'C:\\Users\\dell/nltk_data'
- 'C:\\nltk_data'
- 'D:\\nltk_data'
- 'E:\\nltk_data'
- 'C:\\Users\\dell\\Anaconda\\nltk_data'
- 'C:\\Users\\dell\\Anaconda\\lib\\nltk_data'
- 'C:\\Users\\dell\\AppData\\Roaming\\nltk_data'
**********************************************************************

In [4]: 

Why do I get these errors?

Dakshila Kamalsooriya
  • Try using `nltk.download()`; this opens a panel. From there, under `Corpora`, download the `gutenberg` corpus and try your command again. – Vaulstein Jun 18 '15 at 07:49
  • Thank you! It worked! I downloaded the corpus mentioned in the error message. Will I need the other corpora, which I did not download, later on? – Dakshila Kamalsooriya Jun 18 '15 at 08:14

4 Answers

5

You're missing the Gutenberg corpus that nltk.book loads, hence the error. The error message is self-descriptive.

You need to use `nltk.download()` to download the corpus.

Once the corpus is downloaded, re-run your command and check whether the error comes up again. If it does, it will be for a different corpus; download that one too.
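The download-and-retry loop above can also be scripted. The following is only a sketch: `ensure_corpora` is a hypothetical helper name, and the `find` and `download` parameters are my own addition so the logic can be tried with stand-ins. By default it falls back to `nltk.data.find` and `nltk.download`, and it assumes the resource lives under `corpora/<name>`, as in the error message.

```python
def ensure_corpora(names, find=None, download=None):
    """Download each named corpus only if it cannot already be found.

    `find` and `download` are hypothetical injection points; when omitted,
    the real nltk.data.find and nltk.download are used.
    """
    if find is None or download is None:
        import nltk  # imported lazily so the helper also works with stand-ins
        find = find or nltk.data.find
        download = download or nltk.download

    fetched = []
    for name in names:
        try:
            # Raises LookupError when the resource is in none of the search paths.
            find("corpora/%s" % name)
        except LookupError:
            download(name)
            fetched.append(name)
    return fetched
```

Calling `ensure_corpora(["gutenberg", "brown"])` would then fetch only the corpora that are missing and return their names.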

`from nltk.book import *` is not the preferred method; it is advisable to import only the corpora you will actually use in your code. You could use `from nltk.corpus import gutenberg` instead.

See reference on link

Vaulstein
  • Thanks! I downloaded only the corpora that appear each time in the error. Will I need the other corpora later as well? – Dakshila Kamalsooriya Jun 18 '15 at 08:12
  • It depends on your application and its usage. As you read through the NLTK book you will find that you won't need most of the corpora. Some of the corpora that are required are `brown, treebank, wordnet, words, conll2000, conll2002, ieer, gutenberg` – Vaulstein Jun 18 '15 at 08:19
3

As the NLTK book says, the way to prepare for working with the book is to open up the nltk.download() pop-up, turn to the tab "Collections", and download the "Book" collection. Do it and you can read the rest of the book with no surprises.

Incidentally, you can do the same from the Python console, without the pop-ups, by executing `nltk.download("book")`.

alexis
0

It seems NLTK searches for the data only in specific places (the ones listed in the error description). Try copying the downloaded NLTK data into one of those directories (or create one), such as D:\nltk_data. This solved the issue for me: the error kept showing up even though Gutenberg was already downloaded, because NLTK did not find it in any of those places.

An excerpt from the error you get (these are the directories where you can place the NLTK data so that it can be found):

  • 'C:\Users\dell/nltk_data'
  • 'C:\nltk_data'
  • 'D:\nltk_data'
  • 'E:\nltk_data'
  • 'C:\Users\dell\Anaconda\nltk_data'
  • 'C:\Users\dell\Anaconda\lib\nltk_data'
  • 'C:\Users\dell\AppData\Roaming\nltk_data'
Marek S.
-1

Maybe you should download the nltk_data package in the following directory:

Screenshot 1

Johnny Bones