Questions tagged [wordsegment]

5 questions
1 vote, 0 answers

Merge word segments that overlap or are contained within other segments (Python)

I am working on a project that involves word segmentation in images containing handwritten text, using the scale-space technique. One problem is overlapping segments, as shown in the picture: I want to merge…
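A minimal sketch of one way to merge such segments, assuming each word segment is an axis-aligned bounding box given as `(x1, y1, x2, y2)`. The box format and the names `overlaps` and `merge_boxes` are hypothetical, since the excerpt is cut off before it shows the asker's data. Boxes that intersect, touch, or sit inside one another are replaced by their common bounding box, and the pass repeats until nothing changes.

```python
def overlaps(a, b):
    """Return True if boxes a and b intersect, touch, or one contains the other.

    Boxes are (x1, y1, x2, y2) with x1 <= x2 and y1 <= y2 (assumed format).
    """
    return a[0] <= b[2] and b[0] <= a[2] and a[1] <= b[3] and b[1] <= a[3]


def merge_boxes(boxes):
    """Repeatedly merge overlapping boxes until no two boxes overlap."""
    merged = [tuple(b) for b in boxes]
    changed = True
    while changed:
        changed = False
        result = []
        for box in merged:
            for i, other in enumerate(result):
                if overlaps(box, other):
                    # Replace the stored box with the union bounding box.
                    result[i] = (
                        min(box[0], other[0]),
                        min(box[1], other[1]),
                        max(box[2], other[2]),
                        max(box[3], other[3]),
                    )
                    changed = True
                    break
            else:
                # No overlap with anything kept so far: keep the box as is.
                result.append(box)
        merged = result
    return merged


if __name__ == "__main__":
    print(merge_boxes([(0, 0, 10, 10), (5, 5, 15, 15), (2, 2, 4, 4), (20, 20, 30, 30)]))
```

The repeated pass handles chains, where a merged box grows enough to reach a box it did not overlap before. Touching boxes count as overlapping here; use strict `<` comparisons if that is not wanted.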
1 vote, 1 answer

How to use segment() from wordsegment inside re.sub to extract words from hashtags in Python?

I am working on sentiment analysis of tweets using Python. While cleaning the tweets, I want to extract words from hashtags. I found that the wordsegment library does this very efficiently. However, my issue is that the wordsegment library…
dVinay
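One common pattern for this kind of task is to pass a function as the replacement argument of `re.sub`, so that only the text of each hashtag reaches the segmenter. The sketch below is an illustration under assumptions, not the accepted answer: `toy_segment` is a hypothetical stand-in with a tiny vocabulary so the example runs without third-party packages, and `expand_hashtags` is a made-up helper name. With wordsegment installed, its `segment` function (after calling `load()`) could be passed in instead.

```python
import re


def toy_segment(text):
    """Hypothetical stand-in for a real segmenter.

    Greedy longest-match against a tiny vocabulary; unknown characters
    are emitted one at a time.
    """
    vocab = {"this", "is", "a", "cat", "happy", "birthday"}
    text = text.lower()
    words = []
    i = 0
    while i < len(text):
        for j in range(len(text), i, -1):
            if text[i:j] in vocab:
                words.append(text[i:j])
                i = j
                break
        else:
            words.append(text[i])
            i += 1
    return words


def expand_hashtags(tweet, segment=toy_segment):
    """Replace each #hashtag with its segmented words, leaving other text alone."""
    # The callback receives a match object; group(1) is the tag without '#'.
    return re.sub(r"#(\w+)", lambda m: " ".join(segment(m.group(1))), tweet)


if __name__ == "__main__":
    print(expand_hashtags("So #happybirthday to you"))
```

Because the callback sees only the captured group, the rest of the tweet keeps its original casing and punctuation.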
1 vote, 1 answer

wordsegment Python library: ValueError: max() arg is an empty sequence

I am using the wordsegment Python library to tokenize my text as follows: from wordsegment import load, segment tweet = 'Sobering stats: 110,000 homes worth $20B in flood-affected areas in Baton Rouge region, #lawx ... via…
sareem
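The excerpt is cut off before the traceback, so the cause cannot be confirmed from this page. One plausible cause, assumed here, is that a token such as `...` contains no letters or digits, leaving the segmenter with an empty string to search. A hedged workaround is to skip such tokens before segmenting. In this sketch `safe_segment` is a hypothetical helper name, and the segmenter is passed in as a parameter so that the example does not depend on any particular library version.

```python
def safe_segment(text, segment):
    """Segment text token by token, skipping tokens with no letters or digits.

    `segment` is any callable that takes a string and returns a list of
    words, for example a segmenter from a third-party library.
    """
    words = []
    for token in text.split():
        # Tokens such as '...' or '--' have nothing to segment; passing
        # them on is assumed to be what triggers the empty-sequence error.
        if not any(ch.isalnum() for ch in token):
            continue
        words.extend(segment(token))
    return words


if __name__ == "__main__":
    # Trivial stand-in segmenter used only for demonstration.
    print(safe_segment("flood-affected areas ... via", lambda t: [t.lower()]))
```

If the error persists with this guard in place, the assumption about the cause is wrong and the full traceback is needed.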
0 votes, 1 answer

Apply jieba segmenter to "content" column and create new column "words" with separated characters in R

I am trying to segment Chinese sentences from the "content" column into words using the jieba package in R, and then create a new corresponding column "words" where each row contains the segmented words of the corresponding row from the previous "content"…
hongpastry
0 votes, 2 answers

Text segmentation using the wordsegment Python package

Folks, I have been using the wordsegment Python library by Grant Jenks for the past couple of hours. The library works fine for incomplete words and for separating combined words, such as e nd ==> end and thisisacat ==> this is a cat. I am working on the…
Saurabh Gokhale