
Here is the code:

%%time
xrange = range

print("Cleaning and parsing the tweets...\n")
clean_tweet_texts = []
for i in xrange(nums[0], nums[1]):
    if (i + 1) % 10000 == 0:
        print("Tweets %d of %d has been processed" % (i + 1, nums[1]))
    clean_tweet_texts.append(tweet_cleaner(df_tweet['text'][i]))

And the full error message:

Cleaning and parsing the tweets...


---------------------------------------------------------------------------
KeyError                                  Traceback (most recent call last)
<timed exec> in <module>()

~\Anaconda3\lib\site-packages\pandas\core\series.py in __getitem__(self, key)
    599         key = com._apply_if_callable(key, self)
    600         try:
--> 601             result = self.index.get_value(self, key)
    602 
    603             if not is_scalar(result):

~\Anaconda3\lib\site-packages\pandas\core\indexes\base.py in get_value(self, series, key)
   2475         try:
   2476             return self._engine.get_value(s, k,
-> 2477                                           tz=getattr(series.dtype, 'tz', None))
   2478         except KeyError as e1:
   2479             if len(self) > 0 and self.inferred_type in ['integer', 'boolean']:

pandas\_libs\index.pyx in pandas._libs.index.IndexEngine.get_value()

pandas\_libs\index.pyx in pandas._libs.index.IndexEngine.get_value()

pandas\_libs\index.pyx in pandas._libs.index.IndexEngine.get_loc()

pandas\_libs\hashtable_class_helper.pxi in pandas._libs.hashtable.Int64HashTable.get_item()

pandas\_libs\hashtable_class_helper.pxi in pandas._libs.hashtable.Int64HashTable.get_item()

KeyError: 859

I am not able to decipher it. Can anybody help with this?

Selcuk

1 Answer

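This error can occur for many reasons, but most probably the DataFrame's index does not contain the label you are indexing with (maybe you dropped that row). With an integer index, df_tweet['text'][i] looks up the index label i, not the i-th row, so a missing label raises KeyError even if the frame still has enough rows. I was able to get the exact same traceback with the following code.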

>>> import numpy as np
>>> import pandas as pd
>>> df = pd.DataFrame(np.random.random((10,3)))
>>> df.drop(3,axis=0,inplace=True)
>>> for i in range(10):
...     print(df[0][i])
...

Output (labels 0 to 2 print, then the lookup for the dropped label 3 fails):

0.49022637034634586
0.5626132827030591
0.09118872448782767
Traceback (most recent call last):
  File "<stdin>", line 2, in <module>
  File "/home/dhananjay/.conda/lib/python3.7/site-packages/pandas/core/series.py", line 1071, in __getitem__
    result = self.index.get_value(self, key)
  File "/home/dhananjay/.conda/lib/python3.7/site-packages/pandas/core/indexes/base.py", line 4730, in get_value
    return self._engine.get_value(s, k, tz=getattr(series.dtype, "tz", None))
  File "pandas/_libs/index.pyx", line 80, in pandas._libs.index.IndexEngine.get_value
  File "pandas/_libs/index.pyx", line 88, in pandas._libs.index.IndexEngine.get_value
  File "pandas/_libs/index.pyx", line 131, in pandas._libs.index.IndexEngine.get_loc
  File "pandas/_libs/hashtable_class_helper.pxi", line 992, in pandas._libs.hashtable.Int64HashTable.get_item
  File "pandas/_libs/hashtable_class_helper.pxi", line 998, in pandas._libs.hashtable.Int64HashTable.get_item
KeyError: 3
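
If a dropped or filtered row is indeed the cause, the following is a minimal sketch of how one might confirm it and work around it. The small frame here is a hypothetical stand-in for df_tweet (same 'text' column name, made-up contents), and the three options are common alternatives rather than the only way to do it.

```python
import pandas as pd

# Hypothetical stand-in for df_tweet: 10 rows, then the row labelled 3 is dropped.
df = pd.DataFrame({"text": ["tweet %d" % n for n in range(10)]})
df = df.drop(3, axis=0)

# Diagnosis: the label 3 no longer exists, although 9 rows remain.
label_present = 3 in df.index
print(label_present, len(df))

# Option 1: positional access with .iloc ignores the index labels.
by_position = [df["text"].iloc[i] for i in range(len(df))]

# Option 2: rebuild a contiguous 0..n-1 index so label lookups work again.
df_reset = df.reset_index(drop=True)
by_label = [df_reset["text"][i] for i in range(len(df_reset))]

# Option 3: skip integer indexing and iterate over the column values directly.
direct = list(df["text"])
```

All three lists hold the same nine strings. In the question's loop, this would mean either calling df_tweet.reset_index(drop=True) before the loop, or replacing df_tweet['text'][i] with df_tweet['text'].iloc[i], provided nums[1] does not exceed len(df_tweet). Something like df_tweet['text'].apply(tweet_cleaner) may also avoid the explicit loop entirely, though that drops the progress messages.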