I am using the wordsegment Python library to tokenize my text as follows:

from wordsegment import load, segment
tweet = 'Sobering stats: 110,000 homes worth $20B in flood-affected areas in Baton Rouge region, #lawx ... via @theadvocatebr'
print(segment(tweet))

However, I'm getting an error that I can neither understand nor fix:

---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
<ipython-input-2-a4734f82b340> in <module>
      1 from wordsegment import load, segment
      2 tweet = 'Sobering stats: 110,000 homes worth $20B in flood-affected areas in Baton Rouge region, #lawx via @theadvocatebr'
----> 3 print(segment(tweet))

~\Anaconda3\lib\site-packages\wordsegment\__init__.py in segment(self, text)
    165     def segment(self, text):
    166         "Return list of words that is the best segmenation of `text`."
--> 167         return list(self.isegment(text))
    168 
    169 

~\Anaconda3\lib\site-packages\wordsegment\__init__.py in isegment(self, text)
    151         for offset in range(0, len(clean_text), size):
    152             chunk = clean_text[offset:(offset + size)]
--> 153             _, chunk_words = search(prefix + chunk)
    154             prefix = ''.join(chunk_words[-5:])
    155             del chunk_words[-5:]

~\Anaconda3\lib\site-packages\wordsegment\__init__.py in search(text, previous)
    138                     yield (prefix_score + suffix_score, [prefix] + suffix_words)
    139 
--> 140             return max(candidates())
    141 
    142         # Avoid recursion limit issues by dividing text into chunks, segmenting

ValueError: max() arg is an empty sequence

I'm using the following on Windows 10:

  • Python 3
  • Anaconda3
  • wordsegment==1.3.0

Any hints on how to resolve this? Is it a library bug?

sareem

1 Answer

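Call load() once after "from wordsegment import load, segment" and before the first call to segment(). As far as I can tell, load() reads the library's word-frequency data, and until it has run segment() has no candidate splits to score, which matches the "max() arg is an empty sequence" error in the traceback. This worked for me.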

Faisal Jawad